I. Introduction

Although the North American power grid has been recognized
as the most important engineering achievement of the 20th
century, the modern power grid faces major challenges llSTl .
Increasingly complex interconnections, even at continental
scale, make the prevention of rare yet catastrophic cascading
failures a pressing concern. Environmental incentives require
carefully revisiting how electrical power is generated, transmit- 
ted, and consumed, with particular emphasis on the integration 
of renewable energy resources. Pervasive use of digital tech- 
nology in grid operation demands resiliency against physical 
and cyber attacks on the power infrastructure. Enhancing grid 
efficiency without compromising stability and quality in the 
face of deregulation is imperative. Soliciting consumer par- 
ticipation and exploring new business opportunities facilitated 
by the intelligent grid infrastructure hold a great economic 
potential. 

The smart grid vision aspires to address such challenges 
by capitalizing on state-of-the-art information technologies in 
sensing, control, communication, and machine learning Q, 
ll24l . The resultant grid is envisioned to have an unprece- 
dented level of situational awareness and controllability over 
its services and infrastructure to provide fast and accurate 
diagnosis/prognosis, operation resiliency upon contingencies 
and malicious attacks, as well as seamless integration of 
distributed energy resources. 

A. Basic Elements of the Smart Grid 

A cornerstone of the smart grid is the advanced monitora- 
bility on its assets and operations. Increasingly pervasive in- 
stallation of the phasor measurement units (PMUs) allows the 
so-termed synchrophasor measurements to be taken roughly 
100 times faster than the legacy supervisory control and data 
acquisition (SCADA) measurements, time-stamped using the 
global positioning system (GPS) signals to capture the grid 
dynamics. In addition, the availability of low-latency two-way 
communication networks will pave the way to high-precision 
real-time grid state estimation and detection, remedial actions
upon network instability, and accurate risk analysis and
post-event assessment for failure prevention.

The provision of such enhanced monitoring and communi- 
cation capabilities lays the foundation for various grid control 
and optimization components. Demand response (DR) aims to 
adapt the end-user power usage in response to energy pricing, 
which is advantageously controlled by utility companies via 
smart meters [29]. Renewable sources such as solar, wind, and
tidal, and electric vehicles are important pieces of the future 
grid landscape. Microgrids will become widespread based on 
distributed energy sources that include distributed generation 
and storage systems. Bidirectional power flow to/from the grid
due to such distributed sources has the potential to improve
grid economy and robustness. New services and businesses
will be generated through open grid architectures and markets. 

B. SP for the Grid in a Nutshell: Past, Present, and Future 

Power engineers in the 1960s faced the problem of
computing voltages at critical points of the transmission grid,
based on power flow readings taken at current and voltage
transformers. Local personnel manually collected these readings
and forwarded them by phone to a control center, where a
set of equations dictated by Kirchhoff's and Ohm's laws was
solved for the electric circuit model of the grid. However,
due to timing misalignment, instrumentation inaccuracy, and
modeling uncertainties present in these measurements, the
equations were always infeasible. Schweppe and others offered
a statistical signal processing (SP) problem formulation, and
advocated a least-squares approach for solving it [69], which
enabled the power grid monitoring infrastructure that has
remained largely unchanged to this day [57], [1].

This is a simple but striking example of how SP expertise
can have a strong impact on power grid operation. From the
early 1970s to the present, the environment of power
system operation has become considerably more complex.
New opportunities have emerged in the smart grid context,
necessitating a fresh look. As will be surveyed in this article,
modern grid challenges call for innovative solutions that tap
into diverse SP techniques from estimation, machine learning,
and network science.

Avenues where significant contribution can be made include 
power system state estimation (PSSE) in various renditions, as 
well as "bad data" detection and removal. As costly large-scale 
blackouts can be caused by rather minor outages in distant 
parts of the network, wide-area monitoring of the grid turns
out to be a challenging yet essential goal iTTSl . Opportunities
abound in synchrophasor technology, ranging from judicious 
placement of PMUs to their role in enhancing observability, 
estimation accuracy, and bad data diagnosis. Unveiling topo- 
logical changes given a limited set of power meter readings is a 
critical yet demanding task. Applications of machine learning 
to the power grid for clustering, topology inference, and 
Big Data processing (e.g., for load/price forecasting) constitute
additional promising directions. 

Power grid operations that can benefit from SP expertise
also include traditional operations such as economic dispatch,
power flow, and unit commitment [84], ITOl , [25], as well as
contemporary ones related to demand scheduling, control of
plug-in electric vehicles, and integration of renewables.
Distributed coordination of the partaking entities,
along with the associated signaling practices and architectures,
requires careful study by SP, control, and optimization
experts.

Without any doubt, computationally intelligent approaches
based on SP methodologies will play a crucial role in this
exciting endeavor. From grid informatics to inference for
monitoring and optimization tools, energy-related issues offer
a fertile ground for SP growth whose time has come.

The rest of the article is organized as follows. Modeling
preliminaries for power system analysis are provided in Sec. II.
Sec. III deals with the monitoring aspect, delineating various
SP-intensive topics including state estimation and PMUs,
as well as the inference, learning, and cyber-security tasks.
Sec. IV is devoted to grid optimization issues, touching
upon both traditional problems in economic power system
operations and more contemporary topics such as
demand response, electric vehicles, and renewables. The article
is wrapped up with a few open research directions in Sec. V.

II. Modeling Preliminaries 

Power systems can be thought of as electric circuits of even 
continent-wide dimensions. They obey multivariate versions 
of Kirchhoff's and Ohm's laws, which in this section are
overviewed using a matrix-vector notation. As the focus is laid 
on alternating current (AC) circuits, all electrical quantities 
involved (voltage, current, impedance, power) are complex- 
valued. Further, quantities are measured in the per unit (p.u.) 
system, which means that they are assumed properly normal- 
ized. For example, if the "base voltage" is 138 kV, then a bus 
voltage of 140 kV is 1.01 p.u. The p.u. system enables uniform 
single- and three-phase system analysis, bounds the dynamic 
range of calculations, and allows for uniform treatment over 
the different voltage levels present in the power grid [84], [25].

Consider first a power system module of two nodes, m and 
n, connected through a line. A node, also referred to as a bus 
in the power engineering nomenclature, can represent, e.g., a 
generator or a load substation. A line (a.k.a. branch) can stand 
for a transmission or distribution line (overhead/underground), 
or even a transformer. Two-node connections can be represented
by the equivalent π model depicted in Fig. 1 [99], [57],
which entails the line series impedance $z_{mn} := 1/y_{mn}$ and
the total charging susceptance $b_{c,mn}$. The former comprises
a resistive part $r_{mn}$ and a reactive (actually inductive) one
$x_{mn} > 0$, that is, $z_{mn} = r_{mn} + jx_{mn}$. The line series
admittance $y_{mn} := 1/z_{mn} = g_{mn} + jb_{mn}$ is often used in
place of the impedance. Its real and imaginary parts are called
conductance and susceptance, respectively. Letting $\mathcal{V}_m$ denote
the complex voltage at node $m$, $\mathcal{I}_{mn}$ the current flowing from
node $m$ to $n$, and invoking Ohm's and Kirchhoff's laws on the
circuit of Fig. 1 yields

$$\mathcal{I}_{mn} = \left(y_{mn} + j b_{c,mn}/2\right)\mathcal{V}_m - y_{mn}\mathcal{V}_n. \qquad (1)$$

The reverse-direction current $\mathcal{I}_{nm}$ is expressed symmetrically.
Unless $b_{c,mn}$ is zero, it holds that $\mathcal{I}_{nm} \neq -\mathcal{I}_{mn}$. A small
shunt susceptance $b_{s,mm}$ is typically assumed between every
node $m$ and the ground (neutral), yielding the current
$\mathcal{I}_{mm} = j b_{s,mm}\mathcal{V}_m$.

Fig. 1. Equivalent π model for a transmission line; the yellow box applies
when an ideal transformer is also present [cf. (10)].

Building on the two-node module, consider next a power
system consisting of a set $\mathcal{N}$ of $N_b$ buses along with a set
$\mathcal{E}$ of $N_l$ transmission lines. By Kirchhoff's current law, the
complex current at bus $m$, denoted by $\mathcal{I}_m$, must equal the sum
of currents on the lines incident to bus $m$; that is,

$$\mathcal{I}_m = \sum_{n \in \mathcal{N}_m} \mathcal{I}_{mn} + \mathcal{I}_{mm} = \Big(\sum_{n \in \mathcal{N}_m} y_{mn} + y_{mm}\Big)\mathcal{V}_m - \sum_{n \in \mathcal{N}_m} y_{mn}\mathcal{V}_n \qquad (2)$$

where $\mathcal{N}_m$ is the set of buses directly connected to bus $m$, and
$y_{mm} := j\big(b_{s,mm} + \sum_{n \in \mathcal{N}_m} b_{c,mn}/2\big) := j b_{mm}$. Collecting
node voltages (currents) in the $N_b \times 1$ vector $\mathbf{v}$ ($\mathbf{i}$) leads to
the multivariate Ohm's law

$$\mathbf{i} = \mathbf{Y}\mathbf{v} \qquad (3)$$

where $\mathbf{Y} \in \mathbb{C}^{N_b \times N_b}$ is the so-termed bus admittance matrix
with $(m,m)$-th diagonal entry $\sum_{n \in \mathcal{N}_m} y_{mn} + y_{mm}$,
$(m,n)$-th off-diagonal entry $-y_{mn}$ if $n \in \mathcal{N}_m$, and zero
otherwise (cf. (2)). Matrix $\mathbf{Y}$ is symmetric and, more importantly,
sparse, thus facilitating efficient storage and computations.
On the contrary, the bus impedance matrix $\mathbf{Z}$, defined as the
inverse of $\mathbf{Y}$ (and not as the matrix of bus pair impedances),
is full and therefore seldom used.
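
As an illustration of how the bus admittance matrix can be assembled from the π-model parameters, the following minimal sketch builds Y for a small network. The function name `build_ybus`, the 3-bus topology, and all numerical values are hypothetical and chosen only for demonstration; buses are indexed from 0.

```python
import numpy as np

def build_ybus(n_bus, lines, shunts=None):
    """Assemble the bus admittance matrix from per-line pi-model data.

    lines  : iterable of (m, n, z_mn, b_c_mn) with 0-based bus indices,
             series impedance z_mn and total charging susceptance b_c_mn.
    shunts : optional length-n_bus array of shunt susceptances b_s,mm.
    """
    Y = np.zeros((n_bus, n_bus), dtype=complex)
    for m, n, z, b_c in lines:
        y = 1.0 / z                      # series admittance y_mn
        Y[m, m] += y + 1j * b_c / 2      # half of the charging at each end
        Y[n, n] += y + 1j * b_c / 2
        Y[m, n] -= y                     # off-diagonal entries are -y_mn
        Y[n, m] -= y
    if shunts is not None:
        Y[np.diag_indices(n_bus)] += 1j * np.asarray(shunts)
    return Y

# Hypothetical 3-bus example (all values in p.u.).
lines = [(0, 1, 0.01 + 0.10j, 0.02),
         (1, 2, 0.02 + 0.20j, 0.04),
         (0, 2, 0.01 + 0.15j, 0.00)]
Y = build_ybus(3, lines)
```

For large grids a sparse container (e.g., from `scipy.sparse`) would be the natural choice, since most entries of Y are zero; a dense array is used here only for brevity.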

A major implication of (3) is control of power flows. Let
$S_m := P_m + jQ_m$ be the complex power injected at bus $m$,
whose real and imaginary parts are the active power $P_m$ and
the reactive power $Q_m$, respectively. Physically, $S_m$ represents
the power generated and/or consumed by plants and loads residing at bus $m$.
For bus $m$ and with $^*$ denoting conjugation, it holds that
$S_m = \mathcal{V}_m \mathcal{I}_m^*$, or after collecting all power injections in
$\mathbf{s} \in \mathbb{C}^{N_b}$ (diag($\mathbf{v}$) denotes a diagonal matrix holding $\mathbf{v}$ on
its diagonal), one arrives at (cf. (3))

$$\mathbf{s} = \mathrm{diag}(\mathbf{v})\,\mathbf{i}^* = \mathrm{diag}(\mathbf{v})\,\mathbf{Y}^*\mathbf{v}^*. \qquad (4)$$

Complex power flowing from bus $m$ to a neighboring bus $n$
is similarly given by

$$S_{mn} = \mathcal{V}_m \mathcal{I}_{mn}^*. \qquad (5)$$

The ensuing analysis pertains mainly to nodal quantities.
However, line quantities such as line currents and power flows
over lines can be modeled accordingly using (1) and (5).
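
The relations (3)-(5) translate directly into a few array operations. The sketch below is only a numerical illustration under assumed data: the 3-bus lossless network, its reactances, and the voltage profile are hypothetical. It also cross-checks the result against the polar-form expression for the active power.

```python
import numpy as np

# Hypothetical 3-bus network with purely inductive lines and no shunts (p.u.).
y01, y12, y02 = 1 / 0.10j, 1 / 0.20j, 1 / 0.15j
Y = np.array([[y01 + y02, -y01, -y02],
              [-y01, y01 + y12, -y12],
              [-y02, -y12, y02 + y12]])

# Assumed complex nodal voltages: magnitudes times unit phasors.
Vmag = np.array([1.00, 0.98, 1.02])
theta = np.array([0.0, -0.05, 0.03])
v = Vmag * np.exp(1j * theta)

i = Y @ v                    # multivariate Ohm's law (3)
s = v * np.conj(i)           # nodal injections (4): s = diag(v) i*
P, Q = s.real, s.imag

# Line flows (5) in both directions of line (0, 1); b_c = 0 here,
# so the line current reduces to y_01 (V_0 - V_1).
S01 = v[0] * np.conj(y01 * (v[0] - v[1]))
S10 = v[1] * np.conj(y01 * (v[1] - v[0]))

# Cross-check of the active power using the polar form of the injections.
G, B = Y.real, Y.imag
dth = theta[:, None] - theta[None, :]
P_polar = (np.outer(Vmag, Vmag) * (G * np.cos(dth) + B * np.sin(dth))).sum(axis=1)
```

Because the assumed lines are lossless, the active injections sum to zero and the active flows `S01.real` and `S10.real` cancel; with resistive lines their sum would equal the line loss.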

Typically, the complex bus admittance matrix is written in
rectangular coordinates as $\mathbf{Y} = \mathbf{G} + j\mathbf{B}$. Two options become
available from (4), depending on whether the complex nodal
voltages are expressed in polar or rectangular form. The polar
representation $\mathcal{V}_m = V_m e^{j\theta_m}$ yields [cf. (4)]

$$P_m = \sum_{n=1}^{N_b} V_m V_n \left(G_{mn}\cos\theta_{mn} + B_{mn}\sin\theta_{mn}\right) \qquad (6a)$$

$$Q_m = \sum_{n=1}^{N_b} V_m V_n \left(G_{mn}\sin\theta_{mn} - B_{mn}\cos\theta_{mn}\right) \qquad (6b)$$

where $\theta_{mn} := \theta_m - \theta_n$ for all $m$. Since $P_m$ and $Q_m$ depend on
phase differences $\{\theta_{mn}\}$, power injections $\{S_m\}$ are invariant
to phase shifts of bus voltages. This explains why a selected
bus called the reference, slack, or swing bus is conventionally
assumed to have zero voltage phase without loss of generality.

If $\mathbf{Y}$ is known, the $2N_b$ equations in (6) involve the variables
$\{P_m, Q_m, V_m, \theta_m\}_{m=1}^{N_b}$. Among the $4N_b$ nodal variables, (i)
the reference bus has fixed $(V_m, \theta_m)$; (ii) pairs $(P_m, V_m)$
are controlled at generator buses (and are thus termed PV
buses); while (iii) power demands $(P_m, Q_m)$ are predicted for
load buses (also called PQ buses). Fixing these $2N_b$ variables
and solving the nonlinear equations (6) for the remaining
ones constitutes the standard power flow problem fS?] Ch. 4].
Algorithms for controlling PV buses and predicting load at PQ
buses are presented in Sec. IV-A and Sec. III-D3, respectively.

Pairs $(P_{mn}, Q_{mn})$ satisfying (approximately) power flow
equations paralleling (6) can be found in [25, Ch. 3]. Among
the approximations of the latter as well as (6), the so-called
DC model is reviewed next due to its importance in grid
monitoring and optimization. The DC model hinges on three
assumptions:

(A1) The power network is purely inductive, which means that
$r_{mn}$ is negligible. In high-voltage transmission lines, the ratio
$x_{mn}/r_{mn} = -b_{mn}/g_{mn}$ is large enough so that resistances
can be ignored and the conductance part $\mathbf{G}$ of $\mathbf{Y}$ can be
approximated by zero;

(A2) In regular power system conditions, the voltage phase
differences across directly connected buses are small; thus,
$\theta_{mn} \approx 0$ for every pair of neighboring buses $(m, n)$, and the
trigonometric functions in (6) are approximated as
$\sin\theta_{mn} \approx \theta_m - \theta_n$ and $\cos\theta_{mn} \approx 1$; and

(A3) Due to typical operating conditions, the magnitude of
nodal voltages is approximated by one p.u.



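Under (A1)-(A3) and upon exploiting the structure of $\mathbf{B}$ (cf.
(3)), the model in (6) boils down to

$$P_m = -\sum_{n \neq m} b_{mn}\left(\theta_m - \theta_n\right) \qquad (7a)$$

$$Q_m = -b_{mm} - \sum_{n \neq m} b_{mn}\left(V_m - V_n\right) \qquad (7b)$$

where $b_{mn} = -1/x_{mn}$ is the susceptance of the $(m,n)$
branch, and in deriving (7), approximation of nodal voltage
magnitudes to unity implies $V_m V_n \approx 1$, yet
$V_m (V_m - V_n) \approx V_m - V_n$.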

The DC model (7) entails linear equations that are neatly
decoupled: active powers depend only on voltage phases,
whereas reactive powers are solely expressible via voltage
magnitudes. Furthermore, the linear dependence is on voltage
differences. In fact, since $P_{mn} = -b_{mn}(\theta_m - \theta_n)$ and
$b_{mn} < 0$, active power flows across lines from buses with
larger voltage phase to buses with smaller voltage phase.

Consider now the active subproblem described by (7a).
Stacking the nodal real power injections in $\mathbf{p} \in \mathbb{R}^{N_b}$ and the
nodal voltage phases in $\boldsymbol{\theta} \in \mathbb{R}^{N_b}$ leads to

$$\mathbf{p} = \mathbf{B}_x \boldsymbol{\theta} \qquad (8)$$

where the symmetric $\mathbf{B}_x$ is defined similar to $\mathbf{Y}$ by
only accounting for reactances. Specifically,
$[\mathbf{B}_x]_{mm} := \sum_{n \in \mathcal{N}_m} x_{mn}^{-1}$ for all $m$, and
$[\mathbf{B}_x]_{mn} := -x_{mn}^{-1}$ if line $(m,n)$ exists, and zero otherwise.

An alternative representation of $\mathbf{B}_x$ is presented next. Define
matrix $\mathbf{D} := \mathrm{diag}(\{x_l^{-1}\}_{l \in \mathcal{E}})$, and the branch-bus $N_l \times N_b$
incidence matrix $\mathbf{A}$, such that if its $l$-th row $\mathbf{a}_l^T$ corresponds
to the $(m,n)$ branch, then $[\mathbf{a}_l]_m := +1$, $[\mathbf{a}_l]_n := -1$, and
zero elsewhere. Based on these definitions, $\mathbf{B}_x = \mathbf{A}^T\mathbf{D}\mathbf{A}$
can be viewed as a weighted Laplacian of the graph $(\mathcal{N}, \mathcal{E})$
describing the power network. This in turn implies that $\mathbf{B}_x$
is positive semidefinite, and the all-ones vector $\mathbf{1}$ lies in its
null space. Further, its rank is $(N_b - 1)$ if and only if the
power network is connected. Since $\mathbf{B}_x\mathbf{1} = \mathbf{0}$, it follows that
$\mathbf{p}^T\mathbf{1} = 0$; stated differently, the total active power generated
equals the active power consumed by all loads, since resistive
elements and incurred thermal losses are ignored.
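
The Laplacian structure of the DC model is easy to verify numerically. The following sketch, built on a hypothetical 4-bus network with made-up reactances and injections, forms the incidence-based matrix, checks its null space, and solves the DC power flow after fixing bus 0 as the reference.

```python
import numpy as np

# Hypothetical 4-bus network: (from bus, to bus, reactance x in p.u.).
branches = [(0, 1, 0.10), (1, 2, 0.20), (2, 3, 0.25), (0, 3, 0.10), (0, 2, 0.50)]
n_bus, n_line = 4, len(branches)

# Branch-bus incidence matrix A and diagonal matrix D of inverse reactances.
A = np.zeros((n_line, n_bus))
for l, (m, n, _) in enumerate(branches):
    A[l, m], A[l, n] = 1.0, -1.0
D = np.diag([1.0 / x for _, _, x in branches])

Bx = A.T @ D @ A                       # weighted graph Laplacian

# Balanced active injections (generation minus load sums to zero).
p = np.array([1.5, -0.5, -0.7, -0.3])

# Bx is singular, so fix theta_0 = 0 (reference bus) and solve the rest.
theta = np.zeros(n_bus)
theta[1:] = np.linalg.solve(Bx[1:, 1:], p[1:])

flows = D @ A @ theta                  # active flows P_mn = (theta_m - theta_n) / x_mn
```

Removing the row and column of the reference bus is one common way to handle the rank deficiency; any other bus could serve as the reference, which only shifts all phases by a constant.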

As a side note, the terminology DC model stems from the fact
that (8) models the AC power system as a purely resistive DC
circuit by identifying the active powers, reactances, and
voltage phases of the former with the currents, resistances,
and voltages of the latter.

Coming back to the exact power flow model of (4), consider
now expressing nodal voltages in rectangular coordinates. If
$\mathcal{V}_m = V_{r,m} + jV_{i,m}$ for all buses, it follows that

$$P_m = V_{r,m}\sum_{n=1}^{N_b}\left(V_{r,n}G_{mn} - V_{i,n}B_{mn}\right) + V_{i,m}\sum_{n=1}^{N_b}\left(V_{i,n}G_{mn} + V_{r,n}B_{mn}\right) \qquad (9a)$$

$$Q_m = V_{i,m}\sum_{n=1}^{N_b}\left(V_{r,n}G_{mn} - V_{i,n}B_{mn}\right) - V_{r,m}\sum_{n=1}^{N_b}\left(V_{i,n}G_{mn} + V_{r,n}B_{mn}\right) \qquad (9b)$$

Based on (9a) and (9b), it is clear that (re)active power
flows depend quadratically on the rectangular coordinates of
nodal voltages. Because (9) is not amenable to the approximations
invoked for (6), the polar representation has been
traditionally preferred over the rectangular one.

Before closing this section, a few words are due on modeling
transformers, which were not explicitly accounted for so far.
Upon adding the circuit surrounded by the yellow square to
the model of Fig. 1, the possibility of having a transformer
on a branch is considered in its most general setting [25],
[99]. An ideal transformer residing on the $(m,n)$ line at the
$m$-th bus side yields $\mathcal{V}_m = \mathcal{V}_{m'}\rho_{mn}$ and
$\mathcal{I}_{m'n} = \rho_{mn}^*\mathcal{I}_{mm'}$,
where $\rho_{mn} := \tau_{mn}e^{j\alpha_{mn}}$ is its turn ratio. Hence, (1) readily
generalizes to

$$\begin{bmatrix}\mathcal{I}_{mn}\\ \mathcal{I}_{nm}\end{bmatrix} = \begin{bmatrix}\dfrac{y_{mn} + jb_{c,mn}/2}{\tau_{mn}^2} & -\dfrac{y_{mn}}{\rho_{mn}^*}\\[2mm] -\dfrac{y_{mn}}{\rho_{mn}} & y_{mn} + jb_{c,mn}/2\end{bmatrix}\begin{bmatrix}\mathcal{V}_m\\ \mathcal{V}_n\end{bmatrix}. \qquad (10)$$

Using (10) in lieu of (1), a similar analysis can be followed
with the exception that in the presence of phase shifters, the
corresponding bus admittance matrix $\mathbf{Y}$ will not be symmetric.
Note though that the DC model of (8) holds as is, since it
ignores the effects of transformers anyway.

The multivariate current-voltage law (cf. (3)), the power
flow equations (cf. (6) or (9)), along with their linear approximation
(cf. (7)) and generalization (cf. (10)), will play instrumental
roles in the grid monitoring, control, and optimization
tasks outlined in the ensuing sections.

III. Grid Monitoring 

In this section, SP tools and their roles in various grid 
monitoring tasks are highlighted, encompassing state esti- 
mation with associated observability and cyber-attack issues, 
synchrophasor measurements, as well as intriguing inference 
and learning topics. 

A. Power System State Estimation 

Simple inspection of the equations in Sec. II confirms
that all nodal and line quantities become available if one
knows the grid parameters $\{y_{mn}\}$ and all nodal voltages $\{\mathcal{V}_m\}$
that constitute the system state. Power system state estimation
(PSSE) is an important module in the supervisory control and
data acquisition (SCADA) system for power grid operation.
Apart from situational awareness, PSSE is essential in additional
tasks, namely load forecasting, reliability analysis,
the grid economic operations detailed in Sec. IV, network
planning, and billing [25, Ch. 4]. Building on Sec. II, this
section reviews conventional solutions and recent advances,
as well as pertinent smart grid challenges and opportunities
for PSSE.

1) Static State Estimation: Meters installed across the grid
continuously measure electric quantities, and forward them
every few seconds via remote terminal units (RTUs) to the
control center for grid monitoring. Due to imprecise time
signaling and the SCADA scanning process, conventional metering
cannot utilize phase information of the AC waveforms.
Hence, legacy measurements involve (active/reactive) power
injections and flows, as well as voltage and current magnitudes
on specific grid points. Given the SCADA measurements
and assuming stationarity over a scanning cycle, the PSSE
module estimates the state, namely all complex nodal voltages
collected in $\mathbf{v}$. Recall that according to the power flow models
presented in Sec. II, all grid quantities can be expressed in
terms of $\mathbf{v}$. Thus, the $M \times 1$ vector of SCADA measurements
can be modeled as $\mathbf{z} = \mathbf{h}(\mathbf{v}) + \mathbf{e}$, where $\mathbf{h}(\cdot)$ is a properly
defined vector-valued function, and $\mathbf{e}$ captures measurement
noise and modeling uncertainties. Upon prewhitening, $\mathbf{e}$ can
be assumed standard Gaussian. The maximum-likelihood estimate
(MLE) of $\mathbf{v}$ can then be simply expressed as the
nonlinear least-squares (LS) estimate

$$\hat{\mathbf{v}} := \arg\min_{\mathbf{v}} \|\mathbf{z} - \mathbf{h}(\mathbf{v})\|_2^2. \qquad (11)$$



Prior information, such as zero-injection buses ($P_m = Q_m = 0$)
and feasible ranges (of $V_m$ and $\theta_m$), can be included as
constraints in (11). In any case, the optimization problem is
nonconvex. For example, when states are expressed in rectangular
coordinates, the functions in $\mathbf{h}(\cdot)$ are quadratic; cf. (9). In
general, PSSE falls under the class of nonlinear LS problems,
for which Gauss-Newton iterations are known to offer the
"workhorse" solution [1, Ch. 2]. Specifically, upon expressing
$\mathbf{v}$ in polar coordinates, the nonlinear $\mathbf{h}(\mathbf{v})$ can be linearized
using Taylor's expansion around a starting point. The Gauss-Newton
method hence approximates the cost in (11) with a
linear LS one, and relies on its minimizer to initialize the
subsequent iteration. This iterative procedure is closely related to
gradient descent algorithms for solving nonconvex problems,
which are known to encounter two issues: i) sensitivity to the
initial guess; and ii) convergence concerns. Without guaranteed
convergence to the global optimum, existing variants improve
numerical stability of the matrix inversions per iteration [1].
In a nutshell, the grand challenge so far remains to develop
a solver attaining or approximating the global optimum in
polynomial time.
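
The Gauss-Newton iteration described above can be sketched on a toy problem. Everything below is an illustrative assumption rather than a production estimator: the 3-bus network, the measurement set (all power injections plus one voltage magnitude), the noise-free data, and the use of a finite-difference Jacobian in place of the analytic one.

```python
import numpy as np

# Hypothetical 3-bus network (p.u.), no shunts or line charging.
y01, y12, y02 = 1 / (0.01 + 0.10j), 1 / (0.02 + 0.20j), 1 / (0.01 + 0.15j)
Y = np.array([[y01 + y02, -y01, -y02],
              [-y01, y01 + y12, -y12],
              [-y02, -y12, y02 + y12]])

def h(x):
    """Measurement map for x = [theta_1, theta_2, V_0, V_1, V_2]; theta_0 = 0."""
    theta = np.concatenate(([0.0], x[:2]))
    v = x[2:] * np.exp(1j * theta)
    s = v * np.conj(Y @ v)
    # Active injections, reactive injections, and the magnitude at bus 0.
    return np.concatenate((s.real, s.imag, [x[2]]))

def jacobian(x, eps=1e-6):
    """Central finite-difference approximation of the Jacobian of h."""
    J = np.zeros((h(x).size, x.size))
    for k in range(x.size):
        d = np.zeros_like(x)
        d[k] = eps
        J[:, k] = (h(x + d) - h(x - d)) / (2 * eps)
    return J

def gauss_newton(z, x0, n_iter=20, tol=1e-10):
    """Solve min_x ||z - h(x)||^2 by successive linear LS problems."""
    x = x0.copy()
    for _ in range(n_iter):
        J = jacobian(x)
        dx, *_ = np.linalg.lstsq(J, z - h(x), rcond=None)
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

x_true = np.array([-0.05, 0.03, 1.00, 0.98, 1.02])
z = h(x_true)                                   # noise-free measurements
x_hat = gauss_newton(z, x0=np.array([0.0, 0.0, 1.0, 1.0, 1.0]))  # flat start
```

The flat start (zero phases, unit magnitudes) is the customary initialization; since the cost is nonconvex, a poor starting point may lead the iteration to a local minimum or prevent convergence altogether.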

Recently, a semidefinite relaxation (SDR) approach has been
recognized to develop polynomial-time PSSE algorithms with
the potential to find a globally optimal solution [95], [96].
Challenged by the nonconvexity of (11), the measurement
model is reformulated as a linear function of the outer-product
matrix $\mathbf{V} := \mathbf{v}\mathbf{v}^{H}$, where the state is now expressed in
rectangular coordinates. This allows reformulating (11) to a
semidefinite program (SDP) with the additional constraint
rank($\mathbf{V}$) = 1. Dropping the nonconvex rank constraint to
acquire a convex SDP has been well-appreciated in signal
processing and communications; see e.g., [52]. The SDR-based
PSSE has been shown to approximate well the global
optimum, while it is possible to further improve computational
efficiency by exploiting the SDP problem structure [95].
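
The reformulation can be written out as follows. This is a sketch under the assumption that every measurement is a quadratic form in the rectangular-coordinate state (e.g., power quantities and squared voltage magnitudes); the matrices $\mathbf{H}_m$ are notation introduced here for illustration only.

```latex
% H_m: Hermitian matrix (assumed notation) such that the m-th noiseless
% measurement equals the quadratic form v^H H_m v.
z_m = \mathbf{v}^{H}\mathbf{H}_m\mathbf{v} + e_m
    = \operatorname{Tr}(\mathbf{H}_m\mathbf{V}) + e_m,
\qquad \mathbf{V} := \mathbf{v}\mathbf{v}^{H}

% Equivalent (still nonconvex) form of (11) in terms of V
\min_{\mathbf{V}\succeq\mathbf{0}} \;
  \sum_{m=1}^{M}\bigl[z_m - \operatorname{Tr}(\mathbf{H}_m\mathbf{V})\bigr]^2
\quad \text{s.t.}\quad \operatorname{rank}(\mathbf{V}) = 1

% SDR: the rank constraint is dropped, leaving a convex problem
\min_{\mathbf{V}\succeq\mathbf{0}} \;
  \sum_{m=1}^{M}\bigl[z_m - \operatorname{Tr}(\mathbf{H}_m\mathbf{V})\bigr]^2
```

If the relaxed solution happens to have rank one, its principal component recovers the state up to a common phase; otherwise a rank-one approximation is typically extracted from it.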

2) Dynamic State Estimation: As power systems evolve in
time, dynamic PSSE is well motivated thanks to its predictive
ability emerging when additional temporal information is
available. In practice, it is challenged by both the unknown
dynamics and the requirement of real-time implementation.
While the latter could become tractable with (extended)
Kalman filtering (KF) techniques, it is more difficult to develop
simple state-space models to capture the power system
dynamics.

There have been various proposals for state transition models
in order to perform the prediction step, mostly relying on
a quasi-steady state behavior; see [67] for a review of the
main developments. One simplified and widely used model
poses a "random-walk" behavior expressing the state in polar
coordinates per time slot $t$ as $\mathbf{v}(t+1) = \mathbf{v}(t) + \mathbf{w}(t)$, where
$\mathbf{w}(t)$ is zero-mean white Gaussian with a diagonal covariance
matrix estimated online [57]. A more sophisticated dynamical
model reads $\mathbf{v}(t+1) = \mathbf{F}(t)\mathbf{v}(t) + \mathbf{e}(t) + \mathbf{w}(t)$, where
$\mathbf{F}(t)$ is a diagonal transition matrix and $\mathbf{e}(t)$ captures the
process mismatch. Recently, a quasi-static state model has
been introduced to determine $\mathbf{e}(t)$ by approximating first-order
effects of load data jT] .

For the correction step, the extended KF (EKF) is commonly
used via linearizing the measurement model around the state
predictor [57], [67]. To overcome the reduced accuracy of EKF
linearization, unscented KF (UKF) of higher complexity has
been reported in lISTl . Particle filtering may also be of interest
if its computational cost can be tolerated by the real-time
requirements of power systems.
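
The prediction and correction steps under the random-walk model can be sketched as below. To keep the example short, the measurement model is assumed linear, $\mathbf{z}(t) = \mathbf{H}\mathbf{v}(t) + \mathbf{e}(t)$ (as would arise, e.g., from the DC model or after linearization); an EKF would instead re-linearize $\mathbf{h}(\cdot)$ around each state predictor. The matrix `H`, the covariances, and the simulated trajectory are hypothetical.

```python
import numpy as np

def kf_step(x, P, z, H, Q, R):
    """One KF cycle for x(t+1) = x(t) + w(t), z(t) = H x(t) + e(t)."""
    # Prediction: a random walk keeps the mean and inflates the covariance.
    x_pred, P_pred = x, P + Q
    # Correction: Kalman gain K = P_pred H^T S^{-1} with innovation covariance S.
    S = H @ P_pred @ H.T + R
    K = np.linalg.solve(S, H @ P_pred).T
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(x.size) - K @ H) @ P_pred
    return x_new, P_new

# Hypothetical setup: 3 state variables observed through 5 linear measurements.
rng = np.random.default_rng(0)
n, m, T = 3, 5, 100
H = rng.standard_normal((m, n))
Q, R = 1e-4 * np.eye(n), 1e-2 * np.eye(m)

x_true = np.array([0.0, -0.05, 0.03])
x_hat, P = np.zeros(n), np.eye(n)
for t in range(T):
    x_true = x_true + np.sqrt(1e-4) * rng.standard_normal(n)   # state drift
    z = H @ x_true + np.sqrt(1e-2) * rng.standard_normal(m)    # noisy measurements
    x_hat, P = kf_step(x_hat, P, z, H, Q, R)
```

In practice the process covariance `Q` would be estimated online rather than fixed, as noted for the random-walk model above.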

3) Distributed State Estimation: Parallel and distributed
solvers were investigated early on [69]. The motivation was
primarily computational, even though additional merits of
coordination across adjacent control areas were also recognized.
In vertically integrated electricity markets, each local utility
estimated its own state and modeled the rest of the system
at boundary points using only local measurements. Adjacent
power systems were connected via tie lines, which were basically
used in emergency situations, and PSSE was performed
locally with limited interaction among control centers.

Currently, the deregulation of energy markets has led to
continent-wide interconnections that are split into subnetworks
monitored by independent system operators (ISOs). An increasing
amount of power is transferred over multiple control areas, and
tie lines must be accurately monitored for reliability and
accounting [27]. The ongoing penetration of renewables further
intensifies long-distance power transfers, while their intermittent
nature calls for frequent monitoring. Interconnection-level
PSSE is therefore a key factor for modernizing power grids.
Even though advanced instrumentation can provide precise and
timely measurements (cf. Sec. III-C), an interconnection could
consist of thousands of buses. This scale, together with privacy
policies, renders decentralized PSSE a pertinent solution.

To understand the specifications of distributed PSSE, consider
the toy example of Fig. 2. Area 2 consists of buses
{3, 4, 7, 8}, but it also collects current measurements on tie
lines {(4, 5), (4, 9), (7, 9)}. Its control center has two options
regarding these measurements: either to ignore them and focus
on the internal state, or to consider them and augment its state
by the external buses {5, 9}. The first option is statistically
suboptimal, and it may even incur observability loss (check
for example Area 3). For the second option, neighboring areas
should agree on shared variables. This way, agreement is
achieved over tie line charges and the global PSSE problem
is optimally solved.

Fig. 2. The IEEE 14-bus power system partitioned into four areas [80]. Dotted
lassos show the buses belonging to extended area states. PMU bus voltage
(line current) measurements are depicted by green circles (blue squares).

It was realized early on that for a chain of serially interconnected
areas, KF-type updates can be implemented incrementally
in space [69, Pt. III]. For arbitrarily connected areas
though, a two-level approach with a global coordinator is
required [69]: Local measurements involving only local states
are processed to estimate the latter. Local estimates of shared
states, their associated covariance matrices, and tie line measurements
are forwarded to a global coordinator. The coordinator
then updates the shared states and their statistics. Several
recent renditions of this hierarchical approach are available
under the assumption of local observability [27], [28]. A
central coordinator becomes a single point of failure, while
the sought algorithms may be infeasible due to computational,
communication, or policy limitations. Decentralized solutions
include block Jacobi iterations [16] and the auxiliary problem
principle [19]. Local observability is waived in jsF), where a
copy of the entire high-dimensional state vector is maintained
per area, and linear convergence of the proposed first-order
algorithm scales unfavorably with the interconnection size.
A systematic framework based on the alternating direction
method of multipliers is put forth in [34]. Depending solely on
existing PSSE software, it respects privacy policies, exhibits
low communication load, and its convergence is guaranteed
even in the absence of local observability. Finally, for a survey
on multi-area PSSE, refer to [28].

4) Generalized State Estimation (G-SE): PSSE presumes
that grid connectivity and the electrical parameters involved
(e.g., line admittances) are known. Since these are oftentimes
unavailable, generalized state estimation (G-SE) extends the
PSSE task to jointly recovering them too [1, Ch. 8], JSS]
Sec. 4.10]. PSSE operates on the bus/branch grid model;
cf. Fig. 3(a). A more meticulous view of this grid is offered
by the corresponding bus section/switch model depicted in
Fig. 3(b). This shows how a bus is partitioned by circuit
breakers into sections (e.g., bus 1 to sections {1, 15-19}),
or how a substation can appear as two different buses (e.g.,
sections {10, 52-54} and {14, 55-57} mapped to buses 10
and 14, respectively). Circuit breakers are zero-impedance
switching components and are used for seasonal, maintenance,
or emergency reconfiguration of substations. For some of
them, the status and/or the power they carry may be reported
to the control center. A topology processing unit collects this
information and validates network connectivity prior to PSSE
[22].

Fig. 3. The IEEE 14-bus power system benchmark [80]: (a) Bus/branch model
(the conventional model); (b) Bus section/switch model (an assumed
substation-level model [26]). Solid (hollow) squares
indicate closed (open) circuit breakers. The original 14 buses preserve
their numbering. Thick (thin) lines correspond to finite- (zero-)impedance
transmission lines (circuit breaker connections).

Even though topology malfunctions can be detected by
large PSSE residual errors, they are not easily identifiable [Q.
Hence, joint PSSE with topology processing under the G-SE
task has been a well-appreciated solution [57]. G-SE essentially
performs state estimation using the bus section/switch
model. Due to the zero impedances though, breaker flows
are appended to the system state. For regular transmission
lines of unknown status or parameters, G-SE augments the
system state by their flows likewise. In any case, to tackle the
increased state dimensionality, breakers of known status are
treated as constraints: open (closed) breakers correspond to
zero flows (voltage drops). Practically, not all circuit breakers
are monitored; and even for those monitored, the reported
status may be erroneous [1]. Nowadays, G-SE is further
challenged: the penetration of renewables and DR programs
will cause frequent substation reconfigurations. Yet, G-SE can
be aided by advanced substation automation and contemporary
intelligent electronic devices (IEDs).

Identifying substation configuration errors has been traditionally
treated by extending robust PSSE methods
(cf. Sec. III-B2) to the G-SE framework. Examples include the
largest normalized residual test, and the least-absolute value
and Huber's estimators [1, Ch. 8]. To reduce the dimensionality
of G-SE, an equivalent smaller-size model has been
developed in [26]. The method in [37] leverages advances in
compressive sampling and instrumentation technology. Upon
regularizing the G-SE cost by $\ell_2$-norms of selected vectors,
it promotes block sparsity on real and imaginary pairs of
suspected breakers.

B. Observability, Bad Data, and Cyber-attacks 

The PSSE module presumes that meters are sufficiently
many and well distributed across the grid so that the power
system is observable. Since this may not always be the
case, observability analysis is a prerequisite of PSSE. Even
when the set of measurements guarantees system state
observability, resilience to erroneous readings should be
ensured through robust PSSE methods. Nonetheless, specific
readings that are (un)intentionally corrupted can harm PSSE results. This
section studies these intertwined topics.

1) Observability Analysis: Given the network model and 
measurements, observability amounts to the ability of uniquely 
identifying the state v. Even when the overall system is unob- 
servable, power system operators are interested in observable 
islands. An observable island is a maximally connected sub- 
grid, whose states become observable upon selecting one of 
its buses as a reference. Identifying observable islands is 
important because it determines which line flows and nodal 
injections can be uniquely recovered. Identifying unobservable 
islands further provides candidate locations for additional 
(pseudo-)measurements needed to restore global observability. 
Pseudo-measurements are prior state information about e.g., 
scheduled generations, forecasted loads, or predicted values 
(based on historical data) to aid PSSE in the form of measure- 
ments with high-variance additive noise (estimation error). 

Due to instrument failures, communication delays, and net- 
work reconfigurations, observability must be checked online. 
The analysis typically resorts to the DC model (7), and
hence, it can be performed separately per active and reactive
subproblems thanks to the P-$\theta$ and Q-V decoupling. Since
power measurements oftentimes come in (re)active pairs, the 
observability results obtained for the active subproblem dS]) 
carry over to the reactive one, assuming additionally that at 
least one nodal voltage magnitude is available per observable 
island (the reactive analogue of the reference bus). 

Commonly used observability checks include topological
as well as numerical ones; see [1, Ch. 4] for a review.
Topological observability testing follows a graph-theoretic
approach [14]. Given the graph of the grid and the available
set of measurements, this test builds a maximal spanning tree.
Its branches are either lines directly metered or lines incident
to a metered bus, while every branch should correspond to a 
different measurement. If such a tree exists, the grid is deemed 
observable; otherwise, the so-derived maximal spanning forest 
defines the observable islands. 

On the other hand, numerical observability considers the 
identifiability of the noiseless approximate DC model
$z = H\theta$ [58]. Linear system theory asserts that the state $\theta$ is
observable if H is full column rank. Recall however that active 
power measurements introduce a voltage phase shift ambiguity 
(cf. ©-(111)). That is why a power system with branch-bus 
incidence matrix $A$ is deemed observable simply if $A\theta = 0$
for every $\theta$ satisfying $H\theta = 0$, i.e., $\mathrm{null}(H) \subseteq \mathrm{null}(A)$.
Observe now that the entries of $A\theta$ are proportional to line
power flows. Hence, intuitively, whenever there is a non-zero 
power flow in the power grid, at least one of its measurements 
should be non-zero for it to be fully observable. When this 
condition does not hold, observable islands can be identified 
via the iterative process developed in [58].
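
The null-space condition stated above can be checked numerically. The following is a minimal sketch, assuming a hypothetical 3-bus path network with unit reactances; the helper names `null_space` and `is_observable` and the tolerances are illustrative choices, not part of the cited methods.

```python
import numpy as np

def null_space(M, tol=1e-9):
    """Orthonormal basis for null(M), obtained from the SVD of M."""
    _, s, vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

def is_observable(H, A, tol=1e-9):
    """True if A @ theta = 0 for every theta with H @ theta = 0."""
    N = null_space(H, tol)
    return N.shape[1] == 0 or np.allclose(A @ N, 0.0, atol=1e-8)

# Hypothetical 3-bus path graph with lines (1,2) and (2,3).
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
H_full = A.copy()      # both line flows are metered
H_partial = A[:1]      # only the flow on line (1,2) is metered

print(is_observable(H_full, A))     # True
print(is_observable(H_partial, A))  # False
```

In the second case a flow on line (2,3) produces no measurement, which is exactly the situation the intuitive statement above rules out.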

2) Robust State Estimation by Cleansing Bad Data: 
Observability analysis treats all measurements received as 
reliable and trustworthy. Nonetheless, time skews, communica- 
tion failures, parameter uncertainty, and infrequent instrument 
calibration can yield corrupted power system readings, also 
known as "bad data" in the power engineering parlance. If 
bad data pass through simple screening tests, e.g., polarity or 
range checks, they can severely deteriorate PSSE performance. 
Coping with them draws methods from robust statistical SP 
to identify outlying measurements, or at least detect their 
presence in the measurement set. 

Two statistical tests, namely the $\chi^2$-test and the largest
normalized residual test (LNRT), were proposed in [69, Part II],
and are traditionally used for bad data detection and
identification, respectively [57], [1, Ch. 5]. Both tests rely on the
model $z = H\theta + e$, assuming a full column rank $m \times n$ matrix
$H$ and a zero voltage phase at the reference bus. The two tests
check the residual error of the LS estimator, which can be
expressed as $r := Pz = Pe$, where $P := I - H(H^T H)^{-1}H^T$
satisfies $P = P^T = P^2$. When $e$ is standardized
Gaussian, $r$ is Gaussian too with covariance $P$; hence, $\|r\|_2^2$
follows a $\chi^2$ distribution with $(m - n)$ degrees of freedom.
The $\chi^2$-test then declares an LS-based PSSE possibly affected
by outliers whenever $\|r\|_2^2$ exceeds a predefined threshold.

LNRT exploits further the Gaussianity of $r$. Indeed, as
$r_i/\sqrt{P_{ii}}$ should be standard Gaussian for all $i$ when bad data
are absent, LNRT finds the maximum absolute value among
these ratios and compares it against a threshold to identify a
single bad datum [1, Sec. 5.7]. Practically, if a bad datum is
detected, it is removed from the measurement set, and the LS
estimator is re-computed. The process is repeated till no bad
data are identified. Successive LS estimates can be efficiently
computed using recursive least-squares (RLS). The LNRT is
essentially the leave-one-out approach, a classical technique
for identifying single outliers. Interesting links between outlier
identification and $\ell_0$-(pseudo)-norm minimization are
presented in [42] and [34] under the Bayesian and the frequentist
frameworks, respectively.
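
The detect-remove-recompute loop described above can be sketched as follows. This is an illustration under stated assumptions: unit-variance errors, a hypothetical 6-by-2 matrix `H`, and an arbitrary threshold of 3.0. For simplicity the LS estimate is recomputed from scratch rather than updated via RLS, and the names `residual_tests` and `lnrt_cleanse` are illustrative.

```python
import numpy as np

def residual_tests(H, z):
    """Return (||r||^2, index of the largest normalized residual, its value)."""
    m = H.shape[0]
    P = np.eye(m) - H @ np.linalg.solve(H.T @ H, H.T)  # residual projector
    r = P @ z
    d = np.diag(P)
    norm_res = np.zeros(m)
    ok = d > 1e-12   # entries with P_ii = 0 cannot be normalized and are skipped
    norm_res[ok] = np.abs(r[ok]) / np.sqrt(d[ok])
    i = int(np.argmax(norm_res))
    return float(r @ r), i, float(norm_res[i])

def lnrt_cleanse(H, z, thresh=3.0):
    """Drop the measurement with the largest normalized residual until none exceeds thresh."""
    keep = np.arange(H.shape[0])
    removed = []
    while len(keep) > H.shape[1]:
        _, i, val = residual_tests(H[keep], z[keep])
        if val <= thresh:
            break
        removed.append(int(keep[i]))
        keep = np.delete(keep, i)
    theta, *_ = np.linalg.lstsq(H[keep], z[keep], rcond=None)
    return theta, removed

# Hypothetical data: noise-free readings with one gross error on measurement 2.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
              [1.0, -1.0], [2.0, 1.0], [1.0, 2.0]])
theta_true = np.array([1.0, -0.5])
z = H @ theta_true
z[2] += 10.0

chi2_stat, _, _ = residual_tests(H, z)   # statistic compared against a chi-square threshold
theta_hat, removed = lnrt_cleanse(H, z)
print(removed)      # [2]
print(theta_hat)    # approximately [1.0, -0.5]
```

In practice the threshold for the sum of squared residuals would be taken from the $\chi^2$ distribution with $m - n$ degrees of freedom at a chosen false-alarm level.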

Apart from the two tests treating bad data a posteriori,
outlier-robust estimators, such as the least-absolute deviation,
the least median of squares, or Huber's estimator have been
considered too; see [1]. Recently, $\ell_1$-norm based methods have
been devised; see e.g.. Ell, ||9ni, ll34l . 

Unfortunately, all bad data cleansing techniques are vulnerable
to the so-called "critical measurements" [1]. A measurement
is critical if once removed from the measurement set,
the power system becomes unobservable. If for example one 
removes the current measurement on line (7, 8) from the grid 
of Fig. 12] then bus 8 voltage cannot be recovered. Actually, 
it can be shown that the $i$-th measurement is critical if the
$i$-th column of $P$ is zero, which translates to $r_i$ being always
zero too. Due to the latter, the LNRT is undefined for critical
measurements.

Intuitively, a critical measurement is the only observation 
related to some state. Thus, this measurement cannot be cross- 
validated or questioned as an outlier, but it should be blindly 
trusted. The existence of critical measurements in PSSE 
reveals the connection between bad data and observability 
analysis. Apparently, the notion of critical measurements can 
be generalized to multiple simultaneously corrupted readings. 
Even though such events are naturally rare, their study be- 
comes timely nowadays under the threat of targeted cyber- 
attacks as explained next. 

3) Cyber-attacks: As a complex cyber-physical system 
spanning a large geographical area, the power grid inevitably 
faces challenges in terms of cyber-security. With more data ac- 
quisition and two-way communication required for the future 
grid, enhancing cyber-security is of paramount importance. 
From working experience in dealing with the Internet and 
telecommunication networks, there is potential for malicious 
and well-motivated adversaries to either physically attack 
the grid infrastructure, or remotely intrude the SCADA sys- 
tem. Among all targeted power grid monitoring and control 
operations, the PSSE task in Sec. III-A appears to be of
extreme interest as adversaries can readily mislead operators 
and manipulate electric markets by altering the system state 

ia, ig. 

Most works analyzing cyber-attacks consider the linear
measurement model modified as $z = H\theta + e + a$, where
the attack vector $a$ has non-zero entries corresponding to
compromised meters. It was initially pointed out in [50] that
if the adversary knows $H$, the attack $a$ can be constructed to
lie in the range space of $H$, i.e., $a = Hc$ for some vector $c$,
so that the system operator can be arbitrarily misled. Under
such a scenario, the attack leaves the LS residual unchanged
and cannot be detected by residual-based tests. Such attacks
are related to the observability and bad data analysis described
earlier, since by deleting the rows of $H$ corresponding to the
nonzero entries of $a$, the resultant system becomes
unobservable [42]. Various strategies to construct $a$ have been
derived in [50], constrained by the number of counterfeit
meters; see also [42] for the minimum number of such meters.
Cyber-attacks under linear state-space models are
considered in [63].
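
The effect of an attack lying in the range space of $H$ can be verified on a toy example. The sketch below assumes a hypothetical measurement matrix and a hypothetical offset `c` chosen by the attacker; readings are kept noise-free only to make the comparison exact.

```python
import numpy as np

def ls_estimate(H, z):
    """Least-squares state estimate and its residual."""
    theta, *_ = np.linalg.lstsq(H, z, rcond=None)
    return theta, z - H @ theta

# Hypothetical measurement matrix and true state.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
              [1.0, -1.0], [2.0, 1.0], [1.0, 2.0]])
theta_true = np.array([1.0, -0.5])
z = H @ theta_true

c = np.array([0.3, 0.2])   # state offset the attacker wants to induce (assumed)
a = H @ c                  # attack vector in range(H)

theta_clean, r_clean = ls_estimate(H, z)
theta_att, r_att = ls_estimate(H, z + a)

print(np.allclose(r_att, r_clean))              # True: residual is unchanged
print(np.allclose(theta_att, theta_clean + c))  # True: estimate is shifted by c
```

Because the residual is identical with and without the attack, neither the $\chi^2$-test nor the LNRT can flag it.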

A major limitation of existing works lies in the linear
measurement model assumption, not to mention the practicality
of requiring attackers to know the full system configuration.
Attacks in nonlinear measurement models for AC systems are
studied in [97]. Granted that a nonlinear PSSE model can be
approximated around a given state point, it is not obvious how
the attacker can acquire such dynamically varying information
in real time in order to construct the approximation. This
requires a per-adversary PSSE and assessment of a significant
portion of meter measurements. On the defender's side,
robustifying PSSE against bad data is a first countermeasure. Since
cyber-attacks can be judiciously designed by adversaries, they
may be more challenging to identify, thus requiring further
prior information, e.g., on the state vector statistics [42].

C. Phasor Measurement Units 

1) Phasor Estimation: PMUs are contemporary devices 
complementing legacy (SCADA) meters in advancing power 
system applications via their high-accuracy and
time-synchronized measurements [65]. Different from SCADA
meters, which provide amplitude (power) related information,
PMUs offer also phase information. At the implementation 
level, current and voltage transformers residing at substations 
provide the analog input waveforms to a PMU. After anti- 
alias filtering, each one of these analog signals is sampled at 
a rate several times the nominal power system frequency $f_0$
(50/60 Hz). If the signal of interest has frequency $f_0$, its phasor
information (magnitude and phase) can be obtained simply by 
correlating a window of its samples with the sampled cosine 
and sine functions, or equivalently by keeping the first (non- 
DC) discrete Fourier transform component. Such correlations 
can be implemented also recursively. Since power system
components operate in the frequency range $f_0 \pm 0.5$ Hz, acquiring
phasor information for off-nominal frequency signals has
also been considered [65, Ch. 3].
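
The correlation with sampled cosine and sine functions described above amounts to evaluating the first non-DC DFT bin over one nominal cycle. A minimal sketch follows, assuming 24 samples per cycle, a signal exactly at the nominal frequency, and a peak-amplitude convention (an RMS convention would divide by $\sqrt{2}$); the function name is illustrative.

```python
import cmath
import math

def estimate_phasor(samples):
    """Phasor of a one-cycle window via the first (non-DC) DFT component."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k / n) for k, x in enumerate(samples))
    return (2.0 / n) * acc   # peak-amplitude convention: X = A * exp(j * phi)

N = 24                            # samples per nominal cycle (assumed)
amp, phase = 1.5, math.pi / 6     # hypothetical waveform parameters
window = [amp * math.cos(2 * math.pi * k / N + phase) for k in range(N)]

X = estimate_phasor(window)
print(abs(X), cmath.phase(X))     # approximately 1.5 and pi/6
```

For off-nominal frequencies the window no longer spans an integer number of cycles, and the estimate acquires the leakage errors that motivate the refinements cited above.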

The critical contribution of PMU technology to grid
instrumentation is time-tagging. Using precise GPS timing (the one
pulse-per-second signal), synchrophasors are time-stamped in
coordinated universal time (UTC). PMU data can thus be
consistently aggregated across large geographic areas. Apart 
from phasors, PMUs acquire the signal frequency and its 
frequency derivative too. Data from several PMUs are col- 
lected by a phasor data concentrator (PDC) which performs 
time-aligning, local cleansing of bad data, and potentially 
data compression before forwarding data flows to the control 
center. The IEEE standards C37.118.1/2-2011 determine PMU
functional requirements.

2) PMU Placement: Although PMU technology is suf- 
ficiently mature, PMU penetration has been limited so far, 
mainly due to the installation and networking costs involved 
ifTSl . Being the key technology towards wide area monitoring 
though guarantees their wide deployment. During this instru- 
mentation stage, prioritizing PMU locations is currently an im- 
portant issue for utilities and reliability operators worldwide. 
Many PMU placement methods are based on the notion of 
topological observability; cf. Sec. III-B1. A search algorithm
for placing a limited number of PMUs on a maximal spanning 
forest is developed in lIMI . Even though topological observ- 
ability in general does not imply numerical observability, for 
practical measurement matrices it does [57]. In any case, a
full column rank yet ill-conditioned linear regression matrix
can yield numerically unstable estimators. Estimation accuracy
rather than observability is probably a more meaningful
criterion. Towards that end, PMU placement is formulated as a
variation of the optimal experimental design problem in [48],
[35]. The approach in [48] considers estimating voltage phases
only, ignores PMU current measurements, and proposes a
greedy algorithm. In [35], the state is expressed in rectangular
coordinates, all PMU measurements are considered, and the
SDP relaxation of the problem is solved via a projected
gradient algorithm. For a detailed review of PMU placements,
the reader is referred to [53].

3) State Estimation with PMUs: As explained in
Sec. III-A, PSSE is conventionally performed using SCADA
measurements [84, Ch. 12]. PMU-based PSSE improves
estimation accuracy when conventional and PMU measurements
are jointly used [66], [65]. However, aggregating conventional
and synchrophasor readings involves several issues. First,
SCADA measurements are available every 4 sec, whereas
30-60 synchrophasors can be reported per sec. Second, explicitly
including conventional measurements turns the linear
PMU-based PSSE problem into a non-linear one. Third,
compatibility with existing PSSE software and phase alignment
should also be considered. An approach to address these challenges
is treating SCADA-based estimates as pseudo-measurements
during PMU-driven state estimation [65]. Essentially, the
slower-rate SCADA-based state estimates, expressed in
rectangular coordinates, together with their associated covariance
matrix can be used as a Gaussian prior for the faster-rate
linear PSSE problem based on PMU measurements [65], [35].
Regarding phase alignment, as already explained,
SCADA-based estimates assume the phase of the reference bus to be
zero, whereas PMUs record phases with respect to GPS timing.
Aligning the phases of the two estimates can be accomplished
by PMU-instrumenting the reference bus, and then simply
adding its phase to all SCADA-based state estimates [65].

Synchrophasor measurements do not contribute only to 
PSSE. Several other monitoring, protection, and control tasks, 
ranging from local to interconnection-wide scope can benefit 
from PMU technology. Voltage stability, line parameter esti- 
mation, dynamic line rating, oscillation and angular separation 
monitoring, small signal analysis are just a few entries from 
the list of targeted applications ifTSl . ll65l . 

D. Additional Inference and Learning Issues 

PSSE offers a prototype class of problems where SP tools can
be readily employed to advance grid monitoring performance,
especially after leveraging recent PMU technology to com- 
plement SCADA measurements. However, additional areas 
can benefit from SP algorithms applied to change detection, 
estimation, classification, prediction, and clustering aspects of 
the grid. 

1) Line Outage Identification: Unexpected events, such 
as a breaker failure, a tree fall, or a lightning strike, can 
make transmission lines inoperative. Unless the control center 
becomes aware of the outage promptly, power generation and 
consumption will remain almost unchanged across the grid. 
Due to flow conservation though, electric currents will be 
automatically altered in the outaged transmission network. 



Fig. 4. Internal system identifies line outages that occurred in the external
system. Legend: bus with PMU; bus without PMU; line in outage.

Hence, shortly after, a few operating lines may exceed their 
ratings and successively fail. A cascading failure can spread 
over interconnected systems in a few minutes and eventually 
lead to a costly grid-wide blackout in less than an hour. Timely
identification of line outages, or more generally of abrupt
changes in line parameters, is thus critical for wide-area monitoring.

One could resort to the generalized PSSE module to identify
line outages (cf. Sec. III-A4). Yet most existing topology
processors rely on data of the local control area (a.k.a.
internal) system; see also Fig. 4. On the other hand, flow
conservation can potentially reveal line changes even in
external systems. This would be a non-issue if inter-system data
were available at a sufficiently high rate. Unfortunately, the
system data exchange (SDX) module of the North-American
Electric Reliability Corporation (NERC) can provide the
grid-wide basecase topology only on an hourly basis [76], while
the desideratum here is near-real-time line monitoring. In a
nutshell, each internal system needs to timely identify line
changes even in the external systems, relying only on local
data and the infrequently updated basecase topology.

To concretely lay out the problem, consider the pre- and
post-event states, and let $\tilde{\mathcal{E}} \subset \mathcal{E}$ denote the subset of lines
in outage. Suppose that the interconnected grid has reached
a stable post-event state, and it remains connected [76].
With reference to the linear DC model in (8), its post-event
counterpart reads $p' = p + \eta = B'\theta'$, where $\eta$ captures
small zero-mean power injection perturbations. Recalling from
Sec. II that $B = A^T D A$, the difference $\tilde{B} := B - B'$ can
be expressed as $\tilde{B} = \sum_{\ell \in \tilde{\mathcal{E}}} \frac{1}{x_\ell} a_\ell a_\ell^T$. With $\tilde{\theta} := \theta' - \theta$, the
"difference model" can be written as
$B\tilde{\theta} = \sum_{\ell \in \tilde{\mathcal{E}}} m_\ell a_\ell + \eta$,
where $m_\ell := a_\ell^T \theta' / x_\ell$, $\forall \ell \in \tilde{\mathcal{E}}$. Based on the latter, to identify
$\tilde{\mathcal{E}}$ of a given cardinality $N_l^o := |\tilde{\mathcal{E}}|$, one can enumerate all
possible topologies in outage, and select the one offering
the minimum LS fit. Such an approach incurs combinatorial
complexity, and has thus limited the existing exhaustive search
methods to identifying single [76], or at most double line
outages [77]. A mixed-integer programming approach was
proposed in [20], which again deals with single line outages.
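
For single outages, the enumeration described above reduces to fitting the difference of the bus angles against each candidate line in turn. The sketch below assumes a hypothetical 4-bus network, noise-free injections, and angle differences available at all buses (a simplification, since the control center typically observes only the internal buses); all names and numerical values are illustrative.

```python
import numpy as np

# Hypothetical 4-bus ring with one chord; each row of A is a line incidence vector.
lines = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
x = np.array([0.1, 0.2, 0.1, 0.25, 0.2])   # line reactances (assumed)
Nb = 4
A = np.zeros((len(lines), Nb))
for l, (m, n) in enumerate(lines):
    A[l, m], A[l, n] = 1.0, -1.0
B = A.T @ np.diag(1.0 / x) @ A

def dc_angles(Bmat, p):
    """Solve p = B theta with bus 0 taken as the zero-phase reference."""
    theta = np.zeros(len(p))
    theta[1:] = np.linalg.solve(Bmat[1:, 1:], p[1:])
    return theta

def identify_single_outage(B, A, theta_diff):
    """Index of the line whose incidence vector best fits B @ theta_diff in the LS sense."""
    y = B @ theta_diff
    errs = []
    for a in A:
        m_hat = (a @ y) / (a @ a)              # LS coefficient for this candidate
        errs.append(np.sum((y - m_hat * a) ** 2))
    return int(np.argmin(errs))

p = np.array([1.0, -0.4, 0.2, -0.8])           # balanced injections (sum to zero)
out = 2                                        # line (2,3) trips
keep = [l for l in range(len(lines)) if l != out]
B_post = A[keep].T @ np.diag(1.0 / x[keep]) @ A[keep]

theta_pre = dc_angles(B, p)
theta_post = dc_angles(B_post, p)
found = identify_single_outage(B, A, theta_post - theta_pre)
print(found)   # 2
```

Extending the loop to pairs or larger subsets of lines makes the number of candidates grow combinatorially, which is the limitation noted in the text.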

To bypass this combinatorial complexity, [98] considers
an overcomplete representation capturing all possible line
outages. By constructing an $N_l \times 1$ vector $m$, whose $\ell$-th entry
equals $m_\ell$ if line $\ell$ is in outage ($\ell \in \tilde{\mathcal{E}}$), and 0 otherwise,
it is possible to reduce the previous model to a sparse linear
regression one given by

$$B\tilde{\theta} = A^T m + \eta. \qquad (12)$$

Since the control center only has estimates of the internal
bus phases, it is necessary to solve (12) for $\tilde{\theta}$, and extract
the rows corresponding to the internal buses. This leads to
a linear model slightly different from (12); but thanks to
the overcomplete representation, identifying $\tilde{\mathcal{E}}$ amounts to
recovering $m$. The key point here is the small number of
line outages ($N_l^o \ll N_l$) that makes the sought vector $m$
sparse. Building on compressive sampling approaches, sparse
signal recovery algorithms have been tested in [98] using
IEEE benchmark systems, and near-optimal performance was
obtained at computational complexity growing only linearly in
the number of outages.

Fig. 5. Real power flow on a major transmission line during the 1996 Western
North American power system breakup [79].

2) Mode Estimation: Oscillations emerge in power systems 
when generators are interconnected for enhanced capacity 
and reliability. Generator rotor oscillations are due to lack 
of damping torque, and give rise to oscillations of bus 
voltages, frequency, and (re)active power flows. Oscillations 
are characterized by the so-termed electromechanical modes, 
whose properties include frequency, damping, and shape [44].
Depending on the size of the power system, modal frequencies
are often in the range of 0.1-2 Hz. While a single generator
usually leads to local oscillations at the higher range (1-2 Hz),
inter-area oscillations among groups of generators lie in the
lower range (0.1-1 Hz). Typically, the latter ones are
more troublesome, and without sufficient damping they grow
in magnitude and may eventually result in grid breakups.
Hence, estimating electromechanical modes, especially the 
low-frequency ones, is truly important, and known as the 
small-signal stability problem in power system analysis |44|. 

Albeit near-and-dear to SP expertise on retrieving harmon- 
ics, modal estimation is challenging primarily due to the 
nonlinear and time varying properties of power systems, as 
well as the co-existence of several oscillation modes at nearby 
frequencies. Fortunately, the system behaves relatively linearly 
when operating at steady state, and can thus be approximated
by the continuous-time vector differential equations
$\dot{x}(t) = A_x(t)x(t) + B_u u(t) + w(t)$, where the eigenvalues
of $A_x(t)$ characterize the oscillation modes, and $u(t)$ and
$w(t)$ correspond to the exogenous input and the random
perturbing noise, respectively. Assuming linear dynamic state
models, mode estimation approaches are either model- or 
measurement-based. The former construct the exact nonlinear 
differential equations from system configurations, and then 
linearize them at the steady state to obtain $A_x(t)$ for
estimating electromechanical modes [55]. In measurement-based
methods, oscillation modes are acquired directly by
peak-picking the spectral estimates obtained using linear
measurements of $x(t)$ [79]. Since the complexity of model-based methods
grows with the network size, scalability issues arise for larger 
systems. With PMUs, modes can be estimated directly from 
synchrophasors, and even updated in real time.
Depending on the input $u(t)$, the measurements are either
ambient, ring-down (a.k.a. transient), or probing; see e.g.,
Fig. 5. With only random noise $w(t)$ attributed to load
perturbations, the system operates under an equilibrium condition
and the ambient measurements look like pseudo-noise. A
ring-down response occurs after some major disturbance, such as
line tripping or a pulse input $u(t)$, and results in
observable oscillations. Probing measurements are obtained after
intentionally injecting known pseudo-random inputs (probing
signals), and can be considered as a special case of
ring-down data. Missing entries and outliers are also expected in
meter measurements, hence robust schemes are of interest for
mode estimation [94]. Measurement-based algorithms can be
either batch or recursive. In batch modal analysis, off-line
ring-down data are modeled as a sum of damped sinusoids
and solved using e.g., Prony's method to obtain linear transfer
functions. Ambient data are handled by either parametric or
nonparametric spectral analysis methods [79]. To recursively
incorporate incoming data, several adaptive SP methods have
been successfully applied, including least-mean squares (LMS)
and RLS [94]. Apart from utilizing powerful statistical SP
tools for mode estimation, it is also imperative to judiciously
design efficient probing signals for improved accuracy with
minimal impact to power system operations [79].
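
On the model-based side, the way eigenvalues of a linearized state matrix translate into modal frequency and damping can be illustrated as follows. The sketch assumes a constant, hypothetical 2-by-2 state matrix describing a single lightly damped oscillator, and uses the common conventions that an eigenvalue $\sigma \pm j\omega$ has frequency $\omega/(2\pi)$ and damping ratio $-\sigma/\sqrt{\sigma^2 + \omega^2}$.

```python
import numpy as np

def electromechanical_modes(A):
    """Frequency (Hz) and damping ratio of each oscillatory eigenvalue of A."""
    lam = np.linalg.eigvals(A)
    lam = lam[lam.imag > 0]                 # keep one eigenvalue per conjugate pair
    freq_hz = lam.imag / (2.0 * np.pi)
    damping_ratio = -lam.real / np.abs(lam)
    return freq_hz, damping_ratio

# Hypothetical second-order system with a 0.5 Hz natural frequency and 5% damping.
w0, zeta = 2.0 * np.pi * 0.5, 0.05
A = np.array([[0.0, 1.0],
              [-w0 ** 2, -2.0 * zeta * w0]])

freq_hz, damping_ratio = electromechanical_modes(A)
print(freq_hz, damping_ratio)   # approximately [0.4994] and [0.05]
```

A frequency near 0.5 Hz falls in the inter-area range mentioned earlier, and a small damping ratio marks the mode as a candidate for closer monitoring.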

3) Load and Electricity Price Forecasting: Smooth opera- 
tion of the grid depends heavily on load forecasts. Different 
applications require load predictions of varying time scales. 
Minute- and hour-ahead load estimates are fed to the unit 
commitment and economic dispatch modules as described in
Sec. IV-A. Predictions at the week scale are used for reliability
purposes and hydro-thermal coordination; while forecasts for 
years ahead facilitate strategic generation and transmission 
planning. The granularity of load forecasts varies spatially 
too, ranging from a substation, utility, to an interconnection 
level. Load forecasting tools are essential for electricity market 
participants and system operators. Even though such tools 
are widely used in vertically organized utilities, balancing 
supply and demand at a deregulated electricity market makes 
load forecasting even more important. At the same time, the 
introduction of electric vehicles and DR programs further 
complicates the problem. 

Load prediction can be simply stated as the problem of 
inferring future power demand given past observations. Of- 
tentimes, historical and predicted values of weather data (e.g., 
temperature and humidity) are included as prediction variables 
too. The particular characteristics of power consumption ren- 
der it an intriguing inference task. On top of a slowly increas- 
ing trend, load exhibits hourly, weekly, and seasonal period- 
icities. Holidays, extreme weather conditions, big events, or a 
factory interruption create outlying data. Moreover, residential, 
commercial, and industrial consumers exhibit different power 
profiles. Apart from the predicted load, uncertainty descriptors 
such as confidence intervals are important. Actually, for certain 
reliability and security applications, daily, weekly, or seasonal 
peak values are critically needed. 

Several statistical inference methods have been applied
for load forecasting: ordinary linear regression; kernel-based
regression and support vector machines; time series analysis
using auto-regressive (integrated) moving average (with
exogenous variables) models (ARMA, ARIMA, ARIMAX);
state-space models with Kalman and particle filtering; neural
networks, expert systems, and artificial intelligence approaches.
Recent academic works and current industry practices are
variations and combinations of these themes reviewed in
[70, Ch. 2]. Low-rank models for load imputation have been
pursued in [54].

Fig. 6. NERC's regional reliability councils and interconnections [Source:
http://en.wikipedia.org/wiki/File:NERC-map-en.svg].

Load forecasting is not the only prediction task in modern
power systems. Under a deregulated power industry, market 
participants can also leverage estimates of future electricity 
prices. To appreciate the value of such estimates, consider a 
day-ahead market: an ISO determines the prices of electric 
power scheduled for generation and consumption at the trans- 
mission level during the 24 hours of the following day. The 
ISO collects the hourly supply and demand bids submitted 
by generator owners and utilities. Using the optimization 
methods described later in Sec. IV-A, the grid is dispatched
in the most economical way while complying with network
and reliability constraints. The outputs of this dispatch are
the power schedules for generators and utilities, along with
associated costs. Modern electricity markets are complex. 
Trading and hedging strategies, weather and life patterns, fuel 
prices, government policies, scheduled and random outages, 
reliability rules, all these factors influence electricity prices. 
Even though prices are harder to predict than loads, the task 
is truly critical in financial decision making [3]. The solutions
proposed so far include econometric methods, physical system 
modeling, time series and statistical methods, artificial
intelligence approaches, and kernel-based approaches; see e.g., [3],
[86], [36] and references therein.

4) Grid Clustering: Modularizing power networks is
instrumental for grid operation as it facilitates decentralized and
parallel computation. Partitioning the grid into control regions
can also be beneficial for implementing "self-healing" features,
including islanding under contingencies [47]. For example,
after catastrophic events, such as earthquakes, alternative power
supplies from different management regions may be necessary 
due to power shortage and system instability. Furthermore, 
grid partitioning is essential for the zonal analysis of power 
systems, to aid load reliability assessment, and operational 
market analysis [8]. In general, it is imperative to partition
the grid judiciously in order to cope with issues involving
connected or disconnected "subgrids." Regional partitioning
of the North American grid is illustrated in Fig. 6, where
each interconnection is further divided into several zones for 
various planning and operation purposes. However, the static 
and manual grid partitioning currently in operation may soon 
become obsolete with the growing incorporation of renewables 
and the overall system scaling. 

The clustering criterion must be in accordance with grid 
partitioning goals. In islanding applications, sub-groups of 
generators are traditionally formed by minimizing the real 
generator-load imbalance to regulate the system frequency 
within each island. Recently, reactive power balance has been 
incorporated in a multi-objective grid partitioning problem to 
support voltage stability in islanding [47]. For these methods,
it is necessary to reflect the real-time operating conditions that 
depend on the slow-coherency among generators, and the flow 
density along transmission lines. 

Different from the islanding methods that deal with real- 
time contingencies, zonal analysis intends to address the long- 
term planning of transmission systems. Therefore, it is critical 
to define appropriate distance metrics between buses. Most 
existing works on long-term reliability have focused on the 
knowledge of network topology, including the seminal work 
of [83], which pointed out the "small-world" effects in power
networks. To account for the structure imposed by Kirchhoff's
laws, it was proposed in [8] to define "electrical distances"
between buses using the inverse admittance matrix.

IV. Optimal Grid Operation 

Leveraging the extensive monitoring and learning modalities 
outlined in the previous section, the next-generation grid will 
be operated with significantly improved efficiency and reduced 
margins. After reviewing classical results on optimal grid dis- 
patch, this section outlines challenges and opportunities related 
to demand-response programs, electric vehicle charging, and 
the integration of renewable energy sources with particular 
emphasis on the common optimization tools engaged. 

A. Economic Operation of Power Systems 

1) Economic Dispatch: Economic dispatch (ED) amounts 
to optimally setting the generation output in an electric power 
network so that the load is served and the cost of generation 
is minimized. ED pertains to generators which consume some 
sort of non-renewable fuel in order to produce electric energy, 
the most typical fuel types being oil, coal, natural gas, or ura- 
nium. In what follows, a prototype ED problem is described, 
with focus placed on a specific time span, e.g. 10 minutes or 



one hour, over which the generation output is supposed to be 
roughly constant. 

Specifically, consider a network with $N_g$ generators. Let
$P_{G_i}$ be the output of the $i$-th generator in MWh. The cost
of the $i$-th generator is determined by a function $C_i(P_{G_i})$,
which represents the cost in dollars for producing energy of
$P_{G_i}$ MWh (i.e., maintaining power output $P_{G_i}$ MW for one
hour). The cost $C_i(P_{G_i})$ is modeled as strictly increasing
and convex, with typical choices including piecewise linear or
smooth quadratic functions. The output of each generator is
an optimization variable in ED, constrained within minimum
and maximum bounds, $P_{G_i}^{\min}$ and $P_{G_i}^{\max}$, determined by the
generator's physical characteristics [84, Ch. 2]. Since a
power plant that is on has substantial power output, $P_{G_i}^{\min}$ is
commonly around 25% of $P_{G_i}^{\max}$.

With $P_L$ denoting the load forecasted as described in
Section III-D3, the prototype ED problem is to minimize the
total generation cost so that there is supply-demand balance
within the generators' physical limits:

$$\min_{\{P_{G_i}\}} \; \sum_{i=1}^{N_g} C_i(P_{G_i}) \qquad (13a)$$
$$\text{subj. to} \; \sum_{i=1}^{N_g} P_{G_i} = P_L \qquad (13b)$$
$$P_{G_i}^{\min} \le P_{G_i} \le P_{G_i}^{\max}. \qquad (13c)$$

Problem (13) is convex, so long as the functions $C_i(P_{G_i})$
are convex. In this case, it can be solved very efficiently.
Convex choices of $C_i(P_{G_i})$ offer a model approximating the
true generation cost quite well and are used widely in the
literature. Nevertheless, the true cost in practice may not be
strictly increasing or convex, while the power output may
be constrained to lie in a collection of disjoint subintervals
$[P_{G_i}^{\min}, P_{G_i}^{\max}]$. These specifications make ED nonconvex, and
hence hard to solve. A gamut of approaches for solving the
ED problem can be found in [84, Ch. 3].

Following a duality approach, suppose that Lagrange mul- 
tiplier A corresponds to constraint (113b) . The multiplier has 
units $/MWh, which has the meaning of price. Then, the KKT 
optimality condition implies that for the optimal generation 
output Pq and the optimal multiplier A*, it holds that 

P5. = argmin {^(Pg. ) - A*Pg J, ^ = 1, . . . , TV. 

(14) 

Recall that $C_i(P_{G_i})$ is the i-th generator's cost in dollars.
Moreover, if $\lambda^*$ is the price at which each generator is getting
paid to produce electricity, then $\lambda^* P_{G_i}$ is the revenue of the i-th
generator. Hence, the minimum in (14) is the net cost, i.e., the
cost minus the revenue, for generator i. The latter reveals that
the optimal generation dispatch is the one minimizing the net
cost for each generator. If an electricity market is in place, ED
is solved by the ISO, with $\{C_i(P_{G_i})\}$ representing the supply
bids.
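
The pricing interpretation can be illustrated numerically. The following is a minimal sketch, assuming quadratic costs $C_i(P) = a_i P^2 + b_i P$ with $a_i > 0$ (one of the typical convex choices mentioned above); the function names and the two-generator data are hypothetical. Each generator responds to a candidate price by minimizing its own net cost as in (14), and the price is adjusted by bisection until the balance (13b) holds.

```python
def dispatch_at_price(price, gens):
    """Per-generator minimizer of C_i(P) - price * P over [pmin, pmax], cf. (14).

    Each generator is a tuple (a, b, pmin, pmax) with the assumed quadratic
    cost C_i(P) = a * P**2 + b * P and a > 0.
    """
    outputs = []
    for a, b, pmin, pmax in gens:
        unconstrained = (price - b) / (2.0 * a)  # solves C_i'(P) = price
        outputs.append(min(max(unconstrained, pmin), pmax))
    return outputs


def economic_dispatch(gens, load, tol=1e-9):
    """Solve (13) by bisection on the multiplier of the balance constraint."""
    if not sum(g[2] for g in gens) <= load <= sum(g[3] for g in gens):
        raise ValueError("load cannot be met within the generation limits")
    lo = min(2.0 * a * pmin + b for a, b, pmin, _ in gens)  # all units at pmin
    hi = max(2.0 * a * pmax + b for a, b, _, pmax in gens)  # all units at pmax
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(dispatch_at_price(mid, gens)) < load:
            lo = mid  # price too low: generators undersupply
        else:
            hi = mid
    price = 0.5 * (lo + hi)
    return price, dispatch_at_price(price, gens)


# Hypothetical data: (a, b, pmin, pmax) per generator, load in MW.
gens = [(0.01, 20.0, 50.0, 200.0), (0.02, 15.0, 40.0, 150.0)]
price, outputs = economic_dispatch(gens, load=250.0)
print(round(price, 4), [round(p, 2) for p in outputs])
```

In this example the second unit saturates at its upper limit, so the price is set by the marginal cost of the first unit, the only one operating strictly within its bounds.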

There are two take-home messages here. First, a very 
important operational feature of an electrical power network is 
to balance supply and demand in the most economical manner, 
and this can be cast as an optimization problem. Second, the 
Lagrange multiplier corresponding to the supply-demand balance
equation can be readily interpreted as a price. However,
the formulation in (13) entails two simplifying assumptions:
(i) it does not account for the transmission network; and (ii)
it only pertains to a specific time interval, e.g., one hour. In
practice, the power output across consecutive time intervals is
limited by the generator physical characteristics. Even though
the more complex formulations presented next alleviate these 
simplifications, the two take-home messages are still largely 
valid. 

2) Optimal Power Flow: The first generalization is to include
the transmission network, using the DC load flow model
of Sec. II. The resultant formulation constitutes the DC
optimal power flow (DC OPF) problem [12]. Specifically, it is
postulated that at each bus there exist a generator and a load
with output $P_{G_m}$ and demand $P_{L_m}$, respectively. The cases
of no or multiple generators/loads on a bus can be readily
accommodated.

Recall that the real power flow from bus $m$ to $n$
is approximated by $P_{mn} \approx -b_{mn}(\theta_m - \theta_n)$. The bus angles
$\{\theta_m\}$ are also variables in the DC OPF problem that reads



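\begin{align}
\min_{\{P_{G_m}\},\{\theta_m\}} \quad & \sum_{m=1}^{N_b} C_m(P_{G_m}) \tag{15a} \\
\text{subj. to} \quad & P_{G_m} - P_{L_m} = -\sum_{n=1}^{N_b} b_{mn}(\theta_m - \theta_n), \quad m = 1, \ldots, N_b \tag{15b} \\
& P_{G_m}^{\min} \le P_{G_m} \le P_{G_m}^{\max}, \quad m = 1, \ldots, N_b \tag{15c} \\
& |P_{mn}| = |b_{mn}(\theta_m - \theta_n)| \le P_{mn}^{\max}, \quad m, n = 1, \ldots, N_b. \tag{15d}
\end{align}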



The objective in (15a) is the total generation cost. Constraint
(15b) is the per bus balance. Specifically, the left-hand side
of (15b) amounts to the net power injected to bus $m$ from
the generator and the load situated at the bus, while the right-hand
side is the total power that flows towards all neighboring
buses. Upon defining vectors for the generator and the load
powers, (15b) could be written in vector form as
$\mathbf{p}_G - \mathbf{p}_L = \mathbf{B}_x \boldsymbol{\theta}$. Finally, constraint (15d) enforces power flow
limits for line protection.
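
To make the DC OPF constraints concrete, here is a small sketch that evaluates the line flows, the per-bus net injections, and the flow limits for given bus angles. The three-bus network, its per-unit values, and all function names are hypothetical; the sign convention $P_{mn} \approx -b_{mn}(\theta_m - \theta_n)$ follows the text, with $b_{mn}$ assumed negative for inductive lines.

```python
def dc_line_flows(b, theta):
    """Flows P_mn = -b_mn * (theta_m - theta_n) for each line (m, n) in b."""
    return {(m, n): -b_mn * (theta[m] - theta[n]) for (m, n), b_mn in b.items()}


def net_injections(b, theta):
    """Total power leaving each bus, i.e., the right-hand side of the bus balance."""
    injections = [0.0] * len(theta)
    for (m, n), flow in dc_line_flows(b, theta).items():
        injections[m] += flow  # the flow leaves bus m ...
        injections[n] -= flow  # ... and arrives at bus n
    return injections


def overloaded_lines(b, theta, limits):
    """Lines whose flow magnitude exceeds the given limit."""
    flows = dc_line_flows(b, theta)
    return [line for line, flow in flows.items() if abs(flow) > limits[line]]


# Hypothetical 3-bus network in per-unit; each line is listed once.
b = {(0, 1): -10.0, (1, 2): -5.0, (0, 2): -10.0}
theta = [0.0, -0.02, -0.05]  # bus angles in radians
limits = {(0, 1): 0.4, (1, 2): 0.4, (0, 2): 0.4}

flows = dc_line_flows(b, theta)
injections = net_injections(b, theta)
congested = overloaded_lines(b, theta, limits)
```

Because the DC model is lossless, the net injections sum to zero. A DC OPF solver would search over the angles and generator outputs so that no line ends up in the `congested` list.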

For convex generation costs $C_m(P_{G_m})$, the DC OPF problem
is convex too, and hence, efficiently solvable. A major
consequence of considering per bus balance equations is that
every bus may have a different Lagrange multiplier. The
pricing interpretation of Lagrange multipliers implies that a
different price, called locational marginal price, corresponds
to each bus. The ED problem (13) can be thought of as a
special case of DC OPF, where the entire network consists of
a single bus on which all generators and loads reside.

Due to the DC load flow approximation, the accuracy of the
DC OPF greatly depends on how well assumptions (A1)-(A3)
hold for the actual power system. For better consistency with
(A2), it is further suggested to penalize the cost (15a) with
the sum of squared voltage angle differences
$\sum_{\text{lines } (m,n)} (\theta_m - \theta_n)^2$, which retains convexity. Even if the DC OPF is a rather
simplified model for actual power systems, it is worth stressing
that it is used for the day-to-day operation in several North
American ISOs.

Consider next replacing the DC with the AC load flow
model (cf. Sec. II) in the OPF context. Generators and loads
are now characterized not only by their real powers, but also
the reactive ones, denoted as $Q_{G_m}$ and $Q_{L_m}$. The AC OPF
takes the form



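\begin{align}
\min \quad & \sum_{m=1}^{N_b} C_m(P_{G_m}) \tag{16a} \\
\text{subj. to} \quad & P_{G_m} - P_{L_m} = \sum_{n=1}^{N_b} \mathrm{Re}\{S_{mn}\}, \quad Q_{G_m} - Q_{L_m} = \sum_{n=1}^{N_b} \mathrm{Im}\{S_{mn}\} \tag{16b} \\
& P_{G_m}^{\min} \le P_{G_m} \le P_{G_m}^{\max}, \quad Q_{G_m}^{\min} \le Q_{G_m} \le Q_{G_m}^{\max} \tag{16c} \\
& |\mathrm{Re}\{S_{mn}\}| \le P_{mn}^{\max}, \quad |S_{mn}| \le S_{mn}^{\max}, \quad V_m^{\min} \le |V_m| \le V_m^{\max}. \tag{16d}
\end{align}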



Constraint (16b) reveals that now both the real and reactive
powers must be balanced per bus. Recall further that $S_{mn}$ represents
the complex power flowing over line $(m, n)$. Therefore,
the first constraint in (16d) refers to the real power flowing
over line $(m, n)$ [cf. (15d)], while the second to the apparent
power. The last constraint in (16d) calls for voltage amplitude
limits.

Due to the nonlinear (quadratic equality) couplings between
the power quantities and the complex voltage phasors, the AC
OPF in (16) is highly nonconvex. Various nonlinear programming
algorithms have been applied for solving it, including
the gradient method, Newton-Raphson, linear programming,
and interior-point algorithms; see e.g., [84, Ch. 13]. These
algorithms are based on the KKT necessary conditions for
optimality, and can only guarantee convergence to a stationary
point at best. Taking advantage of the quadratic relations from
voltage phasors to all power quantities as in SE, the SDR
technique has been successfully applied, while a zero duality
gap has been observed for many practical instances of the
AC OPF, and theoretically established for tree networks; see
BH, [45], and references therein. SDR-based solvers for three-phase
OPF in distribution networks are considered in ifTTI .

The AC OPF offers the most detailed and accurate model
of the transmission network. Two main advantages over its
DC counterpart are: i) the ability to capture ohmic losses;
and ii) its flexibility to incorporate voltage constraints. The
former is possible because the resistive part of the line $\pi$-model
is included in the formulation. Recall in contrast that
assumption (A1) in the DC model sets $r_{mn} = 0$. But it is
exactly the resistive nature of the line that causes the losses.
In view of (16), the total ohmic losses can be expressed as
$\sum_m (P_{G_m} - P_{L_m})$. Such losses in the transmission network
may be as high as 5% of the total load so that they cannot be
neglected ^ Sec. 5.2].

The discussion on OPF — with DC or AC power flow — 
so far has focused on economic operation objectives. System 
reliability is another important consideration, and the OPF can 
be modified in order to incorporate security constraints too,
leading to the security-constrained OPF (SCOPF). Security 
constraints aim to ensure that if a system component fails — 
e.g., if a line outage occurs — then the remaining system 
remains operational. Such failures are called contingencies. 
Specifically, the SCOPF aims to find an operating point such 
that even if a line outage occurs, all post-contingency system 
variables (powers, line flows, bus voltages, etc.) are within 
limits. The primary concern is to avoid cascading failures that 
are the main reasons for system blackouts. As explained in 
Sec. III-D1, if a line is in outage, the power flows on all other
lines are adjusted automatically to carry the generated power. 

SCOPF is a challenging problem due to the large number
of possible contingencies. For the case of the DC OPF, power
flows after a line outage are linearly related to the flows
before the outage through the line outage distribution factors
(LODFs) im, IMl Ch. 11]. The LODFs can be efficiently
calculated based on the bus admittance matrix $\mathbf{B}_x$, and are
instrumental in the security-constrained DC OPF. The case of
AC OPF is much more challenging, and a possible approach
is enumeration of all possible contingency cases; see e.g., [84,
Sec. 13.5] for different approaches.

3) Unit Commitment: Here, the scope of DC OPF is
broadened to incorporate the scheduling of generators across
multiple time periods, leading to the so-termed unit commitment
(UC) problem. It is postulated that the scheduling horizon
consists of periods labeled as $1, \ldots, T$ (e.g., a day consisting
of 24 one-hour periods). Let $P_{G_m}^t$ be the output of the m-th
generator at period $t$, and $P_{L_m}^t$ the respective demand. The
generation cost is allowed to be time-varying, and is denoted
by $C_m^t(P_{G_m}^t)$. A binary variable $u_m^t$ per generator and period
is introduced, so that $u_m^t = 1$ if generator $m$ is on at $t$, and
$u_m^t = 0$ otherwise. Moreover, the m-th bus angle at $t$ is denoted
by $\theta_m^t$.

Consideration of multiple time periods allows inclusion of 
practical generator constraints into the scheduling problem. 
These are the ramp-up/down and minimum up/down time 
constraints. The former indicate that the difference in power 
generation between two successive periods is bounded. The 
latter mean that if a unit is turned on, it must stay on for 
a minimum number of hours; similarly, if it is turned off, it 
cannot be turned back on before a number of periods. The UC 
problem is formulated as follows. 

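\begin{align}
\min_{\{P_{G_m}^t,\, \theta_m^t,\, u_m^t\}} \quad & \sum_{t=1}^{T} \sum_{m=1}^{N_b} \left[ C_m^t(P_{G_m}^t) + S_m^t(\{u_m^\tau\}_{\tau=0}^{t}) \right] \tag{17a} \\
\text{subj. to} \quad & P_{G_m}^t = P_{L_m}^t - \sum_{n=1}^{N_b} b_{mn}(\theta_m^t - \theta_n^t), \quad m = 1, \ldots, N_b, \ t = 1, \ldots, T \tag{17b} \\
& u_m^t P_{G_m}^{\min} \le P_{G_m}^t \le u_m^t P_{G_m}^{\max}, \quad m = 1, \ldots, N_b, \ t = 1, \ldots, T \tag{17c} \\
& P_{G_m}^t - P_{G_m}^{t-1} \le R_m^{\mathrm{up}}, \quad P_{G_m}^{t-1} - P_{G_m}^t \le R_m^{\mathrm{down}}, \quad m = 1, \ldots, N_b, \ t = 1, \ldots, T \tag{17d} \\
& u_m^t - u_m^{t-1} \le u_m^\tau, \quad \tau = t+1, \ldots, \min\{t + T_m^{\mathrm{up}} - 1, T\}, \ t = 2, \ldots, T \tag{17e} \\
& u_m^{t-1} - u_m^t \le 1 - u_m^\tau, \quad \tau = t+1, \ldots, \min\{t + T_m^{\mathrm{down}} - 1, T\}, \ t = 2, \ldots, T \tag{17f} \\
& |b_{mn}(\theta_m^t - \theta_n^t)| \le P_{mn}^{\max}, \quad m, n = 1, \ldots, N_b, \ t = 1, \ldots, T \tag{17g} \\
& u_m^t \in \{0, 1\}, \quad m = 1, \ldots, N_b, \ t = 1, \ldots, T. \tag{17h}
\end{align}

Fig. 7. Relationship between the ED, OPF (DC and AC) and UC. From
left to right, increasing detail in the transmission network model. From top
to bottom, single- to multi-period scheduling (also applicable to ED and AC
OPF).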

The term $S_m^t(\{u_m^\tau\}_{\tau=0}^{t})$ in the cost (17a) captures generator
start-up or shut-down costs. Such costs are generally dependent
on the previous on/off activity. For instance, the more
time a generator has been off, the more expensive it may be
to bring it on again. The initial condition $u_m^0$ is known. It
is also assumed that $C_m^t(0) = 0$. The balance equation is
given next by (17b). Generation limits are captured by (17c).
Constraint (17d) represents the ramp-up/down limits, where
the bounds $R_m^{\mathrm{up}}$ and $R_m^{\mathrm{down}}$ and the initial condition $P_{G_m}^0$ are
given. Constraint (17e) means that if generator $m$ is turned
on at period $t$, it must remain on for the next $T_m^{\mathrm{up}}$ periods;
and similarly for the minimum down time constraint in (17f),
where both $T_m^{\mathrm{up}}$ and $T_m^{\mathrm{down}}$ are given [75]. The line flow
constraints are given by (17g), while the binary feasible set
for the scheduling variables $u_m^t$ is shown in (17h).
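
The minimum up/down time constraints are perhaps the least intuitive part of the formulation. The sketch below checks them for a single generator, assuming the standard form $u^t - u^{t-1} \le u^\tau$ for the minimum up time and $u^{t-1} - u^t \le 1 - u^\tau$ for the minimum down time, with $\tau$ ranging over the periods following $t$. The function name and the example schedules are hypothetical.

```python
def min_up_down_ok(u, t_up, t_down):
    """Check minimum up/down time constraints for one generator.

    u is the on/off schedule [u^1, ..., u^T] with entries in {0, 1};
    t_up and t_down are the minimum up and down times in periods.
    """
    T = len(u)
    for t in range(2, T + 1):              # periods are 1-based, as in the text
        started = u[t - 1] - u[t - 2]      # u^t - u^{t-1}: equals 1 on a start-up
        stopped = u[t - 2] - u[t - 1]      # u^{t-1} - u^t: equals 1 on a shut-down
        for tau in range(t + 1, min(t + t_up - 1, T) + 1):
            if started > u[tau - 1]:       # unit switched off too early
                return False
        for tau in range(t + 1, min(t + t_down - 1, T) + 1):
            if stopped > 1 - u[tau - 1]:   # unit switched back on too early
                return False
    return True


feasible = min_up_down_ok([0, 1, 1, 1, 0, 0, 1], t_up=3, t_down=2)
too_short_up = min_up_down_ok([0, 1, 0, 0, 0], t_up=3, t_down=2)
too_short_down = min_up_down_ok([1, 0, 1, 1], t_up=1, t_down=3)
```

Each inequality is linear in the binary variables, which is what allows these logical requirements to be embedded in a mixed integer program.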

It is clear that problem (17) is a mixed integer program.
What makes it particularly hard to solve is the coupling across
the binary variables expressed by (17e) and (17f). Note that
the DC OPF in (15) is a special case of the UC with
the on/off scheduling fixed and the time horizon limited to
a single period. It is noted in passing that a multi-period
version of the DC OPF can also be considered, by adding the
ramp constraints to (15) while keeping the on/off scheduling
fixed in (17), therefore obtaining a convex program. Most
importantly, note that the UC dimension can be brought into
the remaining two problems described here, that is, the ED and
the AC OPF. In the latter, the problem has two mathematical
reasons for being hard, namely, the integer variables and the
nonconvexity due to the AC load flow. The problems discussed
here are illustrated in Fig. 7.

A traditional approach to solving the UC is to apply
Lagrangian relaxation with respect to the balance equations
[84, Ch. 5], Q, CS). The dual problem can be solved by
a non-differentiable optimization method (e.g., a subgradient
or bundle method), while the Lagrangian minimization step is
solved via dynamic programming. An interesting result within
the Lagrangian duality framework is that the duality gap of the
UC problem without a transmission network diminishes as the
number of generators increases [5]. One of the state-of-the-art
methods for UC is Benders decomposition, which decomposes
the problem into a master problem and tractable subproblems
llTOl Ch. 8].

Fig. 8. Communications infrastructure facilitating DR capabilities.

B. Demand Response 

Demand response (DR) or load response is the adaptation of 
end-user power consumption to time-varying (or time-based) 
energy pricing, which is judiciously controlled by the utility 
companies to elicit desirable energy usage [24], [29]. The
smart grid vision entails engaging residential end-users in 
DR programs. Residential loads have the potential to offer 
considerable gains in terms of flexible load response, because 
their consumption can be adjusted — e.g., an air conditioning 
unit (A/C) — or deferred for later or shifted to an earlier 
time. Examples of flexible loads include pool pumps or 
plug-in (hybrid) electric vehicles. The advent of smart grid 
technologies has also made available at the residential level
energy storage devices (batteries), which can be charged and 
discharged according to residential needs, and thus constitute 
an additional device for control. 

Widespread adoption of DR programs can bring significant
benefits to the future grid. First, the peak demand is reduced as
a result of the load shifting capability, which can have major
economic benefits. Without DR, the peak demand must be
satisfied by generation units such as gas turbines that can turn
on and be brought in very fast during those peaks. Such units
are very costly to operate, and markedly increase the electricity
wholesale prices. This can be explained in a simple manner by
recalling the ED problem and specifically (14). Considering
a gas turbine that is brought in and does not operate at its
limits, (14) implies that $\lambda^* = C'_{\text{turbine}}(P_{G_{\text{turbine}}})$. Expensive units
have precisely a very high derivative $C'$, that is, increasing their
power output requires a lot of fuel.

A second benefit of DR is that it has the potential to 
reduce the end-user bills. This is due to the time-based pricing 
schemes, which encourage consumption during reduced-price 
hours, but also because the wholesale prices become less 
volatile as explained earlier, which means that the electricity 
retailers can procure cheaper sources. A third benefit is that 
DR can strengthen the adoption of renewable energy. The 
reason is that the random and intermittent nature of renewable 
energy can be compensated by the ability of the load to follow 
such effects. More light will be shed on the latter concept in
Sec. IV-D.

DR is facilitated by deployment of the advanced metering
infrastructure (AMI), which comprises a two-way communication
network between utility companies and end-users
(see Fig. 8) [24], [29]. Smart meters installed at end-users'
premises are the AMI terminals at the end-users' side. These
measure not just the total power consumption, but also the
power consumption profile throughout the day, and report it
to the utility company at regular time intervals, e.g., every
ten minutes or every hour. The utility company sends pricing
signals to the smart meters through the AMI, for the
smart meters to adjust the power consumption profile of the
various residential electric devices, in order to minimize the
electricity bill and maximize the end-user satisfaction. Energy
consumption is thus scheduled through the smart meter. The
communication network at the customer's premises between
the smart meter and the smart appliances' controllers is part
of the so-called home area network (HAN).

Time-varying pricing has been a classical research 
topic |To1. The innovation DR brings is that the end-users' 
power consumption becomes controllable, and therefore, part 
of the system optimization. Novel formulations addressing the 
various research issues are therefore called for. DR-related
research issues can be classified in two groups. The first group 
deals with joint optimization of DR for a set of end-users, 
which will be termed hereafter multi-user DR. The second 
group focuses on optimal algorithm design for a single smart 
meter with the aim of minimizing the electricity bill and the 
user discomfort in response to real-time pricing signals. Each 
approach has unique characteristics, as explained next. 

Multi-user DR sets a system-wide performance objective 
accounting for the cost of the energy provider and the user sat- 
isfaction. Joint scheduling must be performed in a distributed 
fashion, and much of the effort is to come up with pricing 
schemes that achieve this goal. Privacy of the customers 
must be protected, in the sense that they do not reveal their 
individual power consumption preferences to the utility, but the 
desired power consumption profile is elicited by the pricing 
signals. One of the chief advantages of joint DR scheduling for 
multiple users is that the peak power consumption is reduced 
as compared to a baseline non-DR approach. The reason is 
that joint scheduling opens up the possibility of loads being 
arranged across time so that valleys are filled and peaks are 
shaved. 

On the other hand, energy consumption scheduling formu- 
lations for a single user can model in great detail the various 
smart appliance characteristics, often leading to difficult non- 
convex optimization problems. This is in contrast with the 
vast majority of multi-user algorithms, which tend to adopt a 
more abstract and less refined description of the end-users' 
scheduling capabilities. More details on the two groups of 
problems are given next. 

1) Multi-user DR: Consider $R$ residential end-users, connected
to a single load-serving entity (LSE), as illustrated in
Fig. 9. The LSE can be an electricity retailer or an aggregator,
whose role is to coordinate the $R$ users' consumption and
present it as a larger flexible load to the main grid. The time
horizon consists of $T$ periods, which can be a set of 1-hour
or ten-minute intervals. User $r$ has a set of smart appliances
$\mathcal{A}_r$. Let $p_{ra}^t$ be the power consumption of appliance $a$ of user
$r$ at time period $t$ (typically in kWh), and $\mathbf{p}_{ra}$ a $T \times 1$ vector
collecting the corresponding power consumptions across slots.

The LSE incurs cost $C^t(s^t)$ for providing energy $s^t$ to the
users. This cost is essentially the cost of energy procurement
from the wholesale market or through direct contracts with
energy generation units, and may also include other operation
and maintenance costs. Each user also adopts a utility function
$U_{ra}(\mathbf{p}_{ra})$, which represents user willingness to consume
power.

Fig. 9. Power network consisting of electricity end-users and the LSE.

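The prototype multi-user DR problem takes the following
form

\begin{align}
\min_{\{s^t\},\{\mathbf{p}_{ra}\}} \quad & \sum_{t=1}^{T} C^t(s^t) - \sum_{r=1}^{R} \sum_{a \in \mathcal{A}_r} U_{ra}(\mathbf{p}_{ra}) \tag{18a} \\
\text{subj. to} \quad & s^t = \sum_{r=1}^{R} \sum_{a \in \mathcal{A}_r} p_{ra}^t, \quad t = 1, \ldots, T \tag{18b} \\
& \mathbf{p}_{ra} \in \mathcal{P}_{ra}, \quad a \in \mathcal{A}_r, \ r = 1, \ldots, R \tag{18c} \\
& s^{\min} \le s^t \le s^{\max}, \quad t = 1, \ldots, T. \tag{18d}
\end{align}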

Clearly, the objective is optimizing the system's social welfare.
Constraint (18b) amounts to a balance equation for each period.
Moreover, the set $\mathcal{P}_{ra}$ in (18c) represents the scheduling
constraints for every appliance, while constraint (18d) bounds
the power provided by the LSE.

Problem (18) is convex as long as $C^t(s^t)$ is convex,
$U_{ra}(\mathbf{p}_{ra})$ is concave, and sets $\mathcal{P}_{ra}$ are convex. This is typically
the case, and different works in the literature address DR using
versions of the previous formulation ifTTI, [56], [68], [23].
Various examples of appliance models (including batteries)
together with their utility functions and constraint sets can also
be found in the aforementioned works.

Problem (18) as described so far amounts to energy consumption
scheduling. Another instance of DR that can be
described by the previous formulation is load curtailment. In
this context, there is an energy deficit in the main grid for a
particular time period, and the LSE must regulate the power
consumption to cover for this deficit. The situation can be
captured in (18) by setting $T = 1$ (single time period), and the
power deficit as $s^{\min} = s^{\max}$. The cost $C^t(s^t)$ then does not affect
the optimization, while the negative of $U_{ra}(\mathbf{p}_{ra})$ represents the
discomfort of the end-user due to the power curtailment, so the
total discomfort $-\sum_{r,a} U_{ra}(\mathbf{p}_{ra})$ is minimized. This problem
is addressed in [62], [39] and the references therein.

One of the main research objectives regarding (18) is to
solve the scheduling problem in a distributed fashion, without
having the functions $U_{ra}(\mathbf{p}_{ra})$ and sets $\mathcal{P}_{ra}$ communicated
to the LSE in order to respect customer privacy. Algorithmic
approaches typically entail message exchanges between the
LSE and the users or among the users, and lead to different
pricing interpretations and models. Specific approaches include
gradient projection; block coordinate descent [56];
dual decomposition and subgradient method [23], [62]; the
Vickrey-Clarke-Groves (VCG) mechanism [68]; Lagrangian
relaxation and Newton method [62]; and dual decomposition
with the bisection and Illinois methods [39].
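
As an illustration of how a price can coordinate users without revealing their preferences, the following is a minimal dual decomposition sketch for a single period of the multi-user DR problem. It rests on several simplifying assumptions that are not from the works cited above: one appliance per user, utility $U_r(p) = w_r p - p^2/2$, LSE cost $C(s) = c s^2/2$, box constraints only, and a fixed step size. All names are hypothetical.

```python
def clip(x, lo, hi):
    return min(max(x, lo), hi)


def user_response(price, w, p_max):
    """User-side step: maximize w*p - 0.5*p**2 - price*p over [0, p_max]."""
    return clip(w - price, 0.0, p_max)


def lse_response(price, c, s_min, s_max):
    """LSE-side step: minimize 0.5*c*s**2 - price*s over [s_min, s_max]."""
    return clip(price / c, s_min, s_max)


def dual_decomposition(w, p_max, c, s_min, s_max, step=0.2, iters=500):
    """Price iteration on the multiplier of the supply-demand balance."""
    price = 0.0
    for _ in range(iters):
        demand = [user_response(price, w_r, p_max) for w_r in w]
        supply = lse_response(price, c, s_min, s_max)
        # Subgradient step: raise the price when demand exceeds supply,
        # lower it otherwise.
        price += step * (sum(demand) - supply)
    # Recompute the responses so that they correspond to the returned price.
    demand = [user_response(price, w_r, p_max) for w_r in w]
    supply = lse_response(price, c, s_min, s_max)
    return price, demand, supply


price, demand, supply = dual_decomposition(
    w=[4.0, 6.0], p_max=10.0, c=1.0, s_min=0.0, s_max=20.0)
```

Only the price and the resulting consumption levels are exchanged; the parameters `w` stay with the users. With richer appliance models the user-side step becomes a small optimization problem of its own, but the message pattern stays the same.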

Formulation (18) refers to ahead-of-time scheduling. Real-time
scheduling is also important. A real-time load response
approach operating on a second-to-second scale is developed
in BTl and references therein. The aim is to have the aggregate
power consumption of a set of thermostatically controlled 
loads (TCLs), such as A/C units, follow a desired signal. 
Model predictive control is employed to this end. Moreover, in 
order to come up with a simple description of the state space 
model pertaining to the set of TCLs, system identification 
ideas are brought to bear. 

2) Single-user DR: The problems here focus on minimizing 
the total cost due to energy consumption or the peak instanta- 
neous cost over a billing interval (or possibly a combination 
thereof). User comfort levels and preferences must also be 
taken into account. 

Detailed modeling of appliance characteristics and scheduling
capabilities typically introduces integer variables into
the formulation, which is somewhat reminiscent of the unit
commitment problem [cf. (17)]; see e.g., [64], EOl, lEll
and references therein. Solution approaches include standard
mixed-integer programming techniques (e.g., branch-and-bound,
Lagrangian relaxation, dynamic programming) as
well as random search methods such as genetic algorithms and
particle swarm optimization. An interesting result is that when
the problem is formulated over a continuous time horizon and
accounts for the fact that appliances can be turned on or off
anytime within the horizon, then it has zero duality gap [22].

Real-time approaches have also been pursued. A linear 
programming DR model with robustness against price un- 
certainty and time-series-based price prediction from period 
to period is developed in ITSl . Moreover, [60] focuses on
TCLs, and specifically, on a building with multiple zones, with
each zone having its own heater. The aim is to minimize the
peak instantaneous cost due to the power consumption of all 
heaters, while keeping each zone at a specified temperature 
interval. The problem is tackled through a decomposition into 
a master mixed-integer program and per zone heater control 
subproblems. 

C. Plug-in (Hybrid) Electric Vehicles 

As an important component of the future smart grid vi- 
sion, electric vehicles (EVs) including plug-in (hybrid) EVs 
(P(H)EVs) are receiving a lot of attention. A global driving 
factor behind the research and development efforts on EVs 
is the environmental concern of the greenhouse gases emitted 
by the conventional fossil fuel-based transportation. As the 
future grids accommodate the renewable energy resources in 
an increasing scale, the carbon footprint is expected to be 
markedly curbed by high EV penetration. Electric driving 
also bears strategic relevance in the context of growing in- 
ternational tension over key natural resources including crude 
oil. From the simple perspective of improving overall energy 
efficiency, electrification of transportation offers an excellent 
potential. 

PEVs interact directly with the power grid through plug-in 
charging of built-in batteries. As such, judicious control and 
optimization of PEV charging pose paramount challenges and
opportunities for the grid economy and efficiency. Since PEV
charging constitutes an elastic energy load that can be time-shifted
and warped, the benefits of DR are to be magnified
when PEV charging is included in DR programs. In fact, as
the scale of PEV adoption grows, it is clear that smart coordination
of the charging task will become crucial to mitigate
overloading of current distribution networks [13], [85], ifTSl .
Without proper coordination, PEV charging can potentially
create new peaks in the load curves with detrimental effects
on generation cost. On the other hand, it is possible for the
PEV aggregators that have control over a fleet of PEVs to
provide ancillary services by modulating the charging rate in
the vehicle-to-grid (V2G) concept [72]. This in turn allows the
utilities to depend less on conventional generators with costly
reserve capacities, and facilitates mitigation of the volatility
of renewable energy resources integrated to the grid [38]. The
aforementioned topics are discussed in more detail next.

1) Coordination of PEV Charging: It is widely recognized 
that uncoordinated PEV charging can pose serious issues on 
the economy of power generation and the quality of power de- 
livered through the distribution networks. PEVs are equipped 
with batteries with sizable capacities, and it is not difficult 
to imagine that most people would opt to start charging their 
vehicles immediately after their evening commute, which is 
the time of the day that already exhibits a significant peak in 
power demand ITSl . Fortunately, the smart grid AMI reviewed 
in Sec. IV-B provides the groundwork for effective scheduling
and control of PEV charging to meet the challenges and sustain 
mass adoption. 

A variety of approaches have been proposed for PEV charg- 
ing coordination. The power losses in the distribution network 
were minimized by optimizing the day-ahead charging rate 
schedules for given PEV charging demands in ifTSI . Real- 
time coordination was considered in fTSl, where the cost due 
to time-varying electricity price as well as the distribution 
losses were minimized by performing a simple sensitivity 
analysis of the cost and accommodating the charging priorities. 
Extending recent results on globally optimal solution of the 
OPF problem via its Lagrangian dual [46], the optimality
of similar approaches for PEV coordination problems was 
investigated in ItTI . 

Interestingly, PEV charging can be also pursued in a distributed
fashion. Further, optimizing feeder losses of distribution
networks, load factor, and load variance are oftentimes
equivalent problems [73]. Leveraging the latter, minimization
of load variance was investigated in [21]. Specifically, the
optimal day-ahead charging profiles $\mathbf{r}_n := [r_n(1), \ldots, r_n(T)]$
for vehicle $n \in \{1, \ldots, N\}$ over a $T$-slot horizon, are obtained
by solving

\begin{align}
\min_{\mathbf{r}_1, \ldots, \mathbf{r}_N} \quad & \sum_{t=1}^{T} \left( D(t) + \sum_{n=1}^{N} r_n(t) \right)^2 \tag{19a} \\
\text{subj. to} \quad & \underline{\mathbf{r}}_n \le \mathbf{r}_n \le \bar{\mathbf{r}}_n, \quad n = 1, \ldots, N \tag{19b} \\
& \sum_{t=1}^{T} r_n(t) = B_n, \quad n = 1, \ldots, N \tag{19c}
\end{align}

where $D(t)$ is the given base demand, $\underline{\mathbf{r}}_n$ and $\bar{\mathbf{r}}_n$ specify
the limits on charging rates, and $B_n$ represents the total
energy expended for charging PEV $n$ to the desired state-of-charge
(SoC). The formulation is referred to as "valley-filling"
in [21], as it schedules PEV loads in the valleys of the base
load curve.

An optimal solution to (19) can be obtained iteratively
[21]. Supposing that the initial pricing signal is $p^0(t) = D(t)$,
$t = 1, 2, \ldots, T$, and the initial charging profiles $\mathbf{r}_n^0$ are
identically zero for iteration $k = 0$, each PEV $n$ updates its
charging profile $\mathbf{r}_n^{k+1}$ via

\begin{align}
\min_{\mathbf{r}_n} \quad & \sum_{t=1}^{T} \left[ p^k(t) r_n(t) + \frac{1}{2} \left( r_n(t) - r_n^k(t) \right)^2 \right] \tag{20a} \\
\text{subj. to} \quad & \underline{\mathbf{r}}_n \le \mathbf{r}_n \le \bar{\mathbf{r}}_n \ \text{and} \ \sum_{t=1}^{T} r_n(t) = B_n. \tag{20b}
\end{align}

A central entity such as the utility or a PEV aggregator then
collects the profiles $\{\mathbf{r}_n^{k+1}\}$ from all PEVs, and updates the
pricing signal as

$$p^{k+1}(t) = D(t) + \sum_{n=1}^{N} r_n^{k+1}(t). \tag{21}$$

The new pricing signals are then fed back to the PEVs and
the procedure iterates until convergence. It is clear from (21)
that the per-vehicle objective in (20a) corresponds to a first-order
estimate of the overall objective in (19a), augmented
with a proximal term. The overall procedure turns out to be a
projected gradient search.
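
The iteration can be sketched in a few lines. The version below is an illustration rather than the cited protocol: it assumes scalar rate limits shared by all vehicles, a proximal weight of 1/2, and a price scaled by $\gamma = 1/N$, a step-size choice added here so that the projected gradient search is stable for more than one vehicle. The per-vehicle update is then the Euclidean projection of $\mathbf{r}_n^k - \mathbf{p}^k$ onto the feasible set, computed by bisection. Function names and data are hypothetical.

```python
def project_profile(v, lo, hi, energy, iters=100):
    """Euclidean projection of v onto {lo <= r <= hi, sum(r) = energy}.

    The projection has the form r(t) = clip(v(t) - nu, lo, hi); the scalar nu
    is found by bisection because sum(r) is nonincreasing in nu.
    Assumes len(v) * lo <= energy <= len(v) * hi.
    """
    def total(nu):
        return sum(min(max(x - nu, lo), hi) for x in v)

    nu_lo, nu_hi = min(v) - hi, max(v) - lo  # total is T*hi and T*lo there
    for _ in range(iters):
        nu = 0.5 * (nu_lo + nu_hi)
        if total(nu) > energy:
            nu_lo = nu
        else:
            nu_hi = nu
    nu = 0.5 * (nu_lo + nu_hi)
    return [min(max(x - nu, lo), hi) for x in v]


def valley_fill(base, energies, lo, hi, iters=200):
    """Price/response iteration with the price scaled by gamma = 1/N (assumed)."""
    n_ev, horizon = len(energies), len(base)
    gamma = 1.0 / n_ev
    profiles = [[0.0] * horizon for _ in range(n_ev)]
    for _ in range(iters):
        # Central entity: price signal computed from the current total load.
        price = [gamma * (base[t] + sum(r[t] for r in profiles))
                 for t in range(horizon)]
        # Each PEV: proximal update, i.e., projection of r_n^k - p^k.
        profiles = [project_profile([r[t] - price[t] for t in range(horizon)],
                                    lo, hi, e)
                    for r, e in zip(profiles, energies)]
    return profiles


base = [3.0, 1.0, 1.0, 3.0]  # hypothetical base demand per slot
profiles = valley_fill(base, energies=[1.0, 3.0], lo=0.0, hi=2.0)
total_load = [base[t] + sum(r[t] for r in profiles) for t in range(len(base))]
```

In this example the two vehicles fill the two low-demand slots, and the total load becomes flat across the horizon.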

2) Integration with Renewables and V2G: It is only when 
the wide adoption of PEVs is coupled with large-scale inte- 
gration of renewable energy sources that the emission problem 
can be alleviated, as the conventional generation itself con- 
tributes heavily to the emission. However, renewable energy 
sources are by nature intermittent, and often hard to predict ac- 
curately. By allowing the PEV batteries or fuel cells to supply 
their stored power to the grid based on the V2G concept, it was
observed in [38] that photovoltaic (PV) resources harnessed by
the EVs could competitively provide peak power (since the PV
power becomes highest a few hours earlier than the daily load
peak quite predictably), and large-scale wind power could be
stabilized for providing base power, via intelligent control.
For specific control strategies to accomplish such benefits,
formulations that maximize the profit for providing ancillary
services were considered in [72] and references therein.

3) Charging Demand Prediction: An important prerequisite 
task to support optimal coordination of PEVs is modeling 
and prediction of the PEV charging demand. The probability 
distributions of the charging demand were characterized in 
II5TI and references therein. Spatio-temporal PEV charging 
demand was analyzed for highway traffic scenarios using a 
fluid traffic model and a queuing model in [4]. However,
there are many interesting issues remaining that deserve further 
research in this forecasting task. 

D. Renewables 

The theme of Sec. IV-A has been economic scheduling of
generators, which consume non-renewable fuels. The subject
of the present section is on including generation from renewable
energy sources (RESs), with the two prime examples
being wind and solar energy. RESs are random and intermittent,
which makes them nondispatchable. That is, RESs
are not only hard to predict, but their intermittency gives
rise to high variability even within time periods as short
as 10 minutes. Therefore, they cannot be readily treated as
conventional generators, and be included in the formulations of
Sec. IV-A. In this context, methods for integrating generation
from RESs to the smart grid operations are outlined next.

1) Forecast-Based Methods: To illustrate the forecast-based
methods, recall the ED problem [cf. (13)], and suppose that
there is also a wind power generator that can serve the load.
The output of the wind power generator for the next time
period is a random variable denoted by $W$. It is assumed that
a forecast $\hat{W}$ is available, and that the wind power generator
has no cost (as it does not consume fuel). Then, the balance
constraint is replaced by [cf. (13b)]

$$\sum_{i=1}^{N_g} P_{G_i} + \hat{W} = P_L \tag{22}$$

while the remainder of the ED problem remains the same.
Since the load is actually forecasted (cf. Sec. III-D3), constraint
(22) essentially treats the uncertain RES no differently
than a negative load.

In order for the forecast to be accurate, the time period of
ED is recommended to be short, such as 10 minutes. Building
on this, a multi-period ED is advocated in [32], where the
main feature is a model-predictive control approach with a
moving horizon. Specifically, the ED over multiple periods
with accompanying forecasts is solved for, e.g., 6 ten-minute
periods representing an hour. The generation is dispatched
during the first period according to the obtained solution.
Then, the horizon is moved, and a new multi-period ED with
updated forecasts is solved, whose results are applied only to
the next period, and so on. Such a method can accommodate
the ramping constraints, and is computationally efficient.
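
The moving-horizon idea can be sketched in a few lines of Python. The toy system below is an assumption made purely for illustration: one cheap, ramp-limited unit with integer output levels, and one expensive fast unit that covers any shortfall. Each multi-period ED is solved exactly on this integer grid by dynamic programming, only the first-period decision is applied, and the window then slides forward. The names `first_move` and `rolling_dispatch` are hypothetical, and the "forecast" is simply a window of the true net load standing in for updated forecasts.

```python
def first_move(g_prev, forecast, ramp, g_max, c_slow, c_fast):
    """Solve the multi-period ED over the forecast window by dynamic
    programming and return the slow unit's output for the first period."""
    levels = range(g_max + 1)
    value = [0.0] * (g_max + 1)  # cost-to-go beyond the last forecast period
    for k in range(len(forecast) - 1, -1, -1):
        net = forecast[k]
        # Cost of holding level g in period k plus the optimal cost afterwards.
        # The fast unit covers any shortfall; surplus output is spilled.
        stage = [c_slow * g + c_fast * max(net - g, 0) + value[g] for g in levels]
        if k == 0:
            reachable = [g for g in levels if abs(g - g_prev) <= ramp]
            return min(reachable, key=lambda g: stage[g])
        # Cost-to-go as a function of the level held in period k-1.
        value = [min(stage[g] for g in levels if abs(g - h) <= ramp)
                 for h in levels]


def rolling_dispatch(net_load, horizon, ramp, g_max, c_slow, c_fast, g0=0):
    """Apply only the first-period decision, then move the horizon."""
    schedule, g = [], g0
    for t in range(len(net_load)):
        forecast = net_load[t:t + horizon]  # stand-in for updated forecasts
        g = first_move(g, forecast, ramp, g_max, c_slow, c_fast)
        schedule.append((g, max(net_load[t] - g, 0)))  # (slow, fast) outputs
    return schedule


net_load = [2, 2, 2, 8, 8, 2]
look_ahead = rolling_dispatch(net_load, horizon=3, ramp=2, g_max=8,
                              c_slow=1.0, c_fast=5.0)
myopic = rolling_dispatch(net_load, horizon=1, ramp=2, g_max=8,
                          c_slow=1.0, c_fast=5.0)
```

With a horizon of one period the slow unit cannot anticipate the load increase and the fast unit must make up the difference, whereas the longer horizon lets the slow unit ramp up in advance; this is the sense in which the moving-horizon scheme accommodates ramping constraints.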

2) Chance-Constrained Methods: To account for the random
nature of RES in ED, the probability distribution of
$W$ comes in handy. Specifically, the constraint is now that the
supply-demand balance holds with high probability $\epsilon$, say
99%. Hence, (22) is substituted by the chance constraint

$$\mathrm{Prob}\left[\sum_{i=1}^{N_g} P_{G_i} + W \ge P_L\right] \ge \epsilon. \qquad (23)$$

Note that the equality of the balance equation has been
replaced by an inequality in (23), because excess power from
RESs can in principle be curtailed.

To solve the chance-constrained ED, the distribution of W 
must be known. For wind power, this is derived from the wind 
speed distribution, and the speed-power output mapping of the 
generator P9| . The most typical speed distribution is Weibull, 
while the speed-power output mapping is nonlinear. Evidently,
this approach poses formidable modeling and computing
challenges when multiple RESs and their spatio-temporal
correlation are considered. The probability that the load is not served
(immediately obtained from the one in (23)) is often called
loss of load probability. Related sophisticated methods which
account for chance constraints are also described in [82]. An
alternative approach not requiring the joint spatio-temporal
wind distribution is presented in [92].
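
For a single wind generator, the chance constraint (23) reduces to a quantile condition: the dispatchable generation must cover the load minus the wind level that is exceeded with probability $\epsilon$. The Python sketch below estimates that level by Monte Carlo sampling from a Weibull wind-speed distribution pushed through a piecewise speed-power curve. The turbine parameters, the Weibull scale and shape, and the function names are all hypothetical choices for illustration only.

```python
import random


def wind_power(speed, rated_power=100.0, cut_in=3.0, rated_speed=12.0,
               cut_out=25.0):
    """Nonlinear speed-to-power mapping (hypothetical turbine parameters)."""
    if speed < cut_in or speed >= cut_out:
        return 0.0
    if speed >= rated_speed:
        return rated_power
    # Cubic growth between cut-in and rated speed.
    return rated_power * (speed**3 - cut_in**3) / (rated_speed**3 - cut_in**3)


def required_dispatch(load, eps, n_samples=20000, scale=8.0, shape=2.0, seed=0):
    """Smallest total dispatchable generation meeting the sampled version
    of (23): the fraction of samples with generation + W >= load is >= eps."""
    rng = random.Random(seed)
    samples = sorted(wind_power(rng.weibullvariate(scale, shape))
                     for _ in range(n_samples))
    # At least a fraction eps of the sorted samples lie at or above index k.
    k = int((1.0 - eps) * n_samples)
    firm_wind = samples[k]
    return max(load - firm_wind, 0.0), samples


generation, wind_samples = required_dispatch(load=150.0, eps=0.99)
```

Under these assumed parameters a sizable fraction of wind speeds falls below the cut-in speed, so the wind level that can be counted on with 99% probability is very small. The sketch also sidesteps the harder case noted above, namely several correlated RESs, where a joint distribution would be needed.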

3) Robust (Minmax) Optimization: This approach postu- 
lates that the power generation from all RESs across space 
and time belongs to a deterministic uncertainty set. The aim 
is to minimize the worst-case operational costs, while setting
the dispatchable generation and other optimization variables to
levels such that the balance is satisfied for any possible RES
output within the uncertainty set. The main attractive feature
here is that no detailed probabilistic models are needed. Only
the uncertainty set must be obtained, e.g., from historical data
or meteorological factors.

A robust version of UC [cf. (17)] is presented next. Following
the notation of Sec. IV-A, it is postulated that there
are RESs with power output $W_m^t$ per bus and time period.
Let $w := \{W_m^t\}_{m,t}$, and let $\mathcal{W}$ denote the uncertainty set for $w$.
The optimization variables are set in two stages. The on/off
variables $u := \{u_m^t\}_{m,t}$ are chosen during the first stage. The
power generation variables and bus angles are set after the RES
power output is realized, which constitutes the second stage.
Therefore, the power outputs and bus angles are functions of
the commitments as well as the RES power outputs, and are
denoted as $P_{G_m}^t(u,w)$ and $\theta_m^t(u,w)$. The robust two-stage
UC problem takes the form

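$$\min_{u,\,\{P_{G_m}^t(u,w),\,\theta_m^t(u,w)\}} \; \sum_{t=1}^{T}\sum_{m=1}^{N_b} S_m^t(u) + \max_{w\in\mathcal{W}} \sum_{t=1}^{T}\sum_{m=1}^{N_b} C_m^t\big(P_{G_m}^t(u,w)\big) \qquad (24a)$$

$$\text{subj. to} \quad \text{the constraints of (17) that involve only } u \qquad (24b)$$

$$\text{the remaining constraints of (17), with balance } P_{G_m}^t(u,w) + W_m^t = P_{L_m}^t + \sum_{n} b_{mn}\big[\theta_m^t(u,w) - \theta_n^t(u,w)\big], \quad \forall\, w \in \mathcal{W}. \qquad (24c)$$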



The objective (24a) consists of the startup/shutdown costs
related to the on/off scheduling decisions, as well as the
worst-case generation costs. The constraints in (24b) pertain only
to the on/off variables, and are identical to those in the UC
problem. The remaining UC constraints must be satisfied for
all possible realizations of the uncertain RES, as indicated
in (24c).

The solution of problem (24) proceeds as follows. The
on/off decisions $u_m^t$ determine the UC ahead of the horizon
$\{1,\ldots,T\}$. Then, at each period, after the RES power output
is realized, functions $P_{G_m}^t(u,w)$ yield the power generation
dispatch. The punch line of this two-stage robust program
is that generation becomes adaptive to the RES uncertainty.
Solution methods typically involve pertinent decompositions
and approximations.

Fig. 10. Distributed control and computation architecture of a microgrid
system. The microgrid energy manager (MGEM) coordinates the local controllers
(LCs) of DERs and dispatchable loads.
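
To make the two-stage structure of the robust UC concrete, the following Python sketch enumerates all on/off commitments for a toy single-bus, single-period system and, for each commitment, evaluates the worst-case second-stage dispatch cost over a finite uncertainty set. Everything here is a simplifying assumption: linear generation costs, a fixed commitment cost standing in for startup/shutdown costs, a finite list of wind outputs standing in for $\mathcal{W}$, and no network or ramping constraints. Practical instances rely on decompositions rather than enumeration.

```python
from itertools import product


def dispatch_cost(commit, units, load, wind):
    """Second stage: cheapest dispatch of the committed units once the wind
    output is known; returns None if the balance cannot be met.

    units: list of (commit_cost, marginal_cost, p_min, p_max) tuples.
    """
    on = [u for u, is_on in zip(units, commit) if is_on]
    p_min_total = sum(u[2] for u in on)
    p_max_total = sum(u[3] for u in on)
    need = max(load - wind, p_min_total)  # surplus wind is curtailed
    if p_min_total > load or need > p_max_total:
        return None
    cost = sum(u[1] * u[2] for u in on)   # every committed unit at its minimum
    remaining = need - p_min_total
    for _, marginal, p_min, p_max in sorted(on, key=lambda u: u[1]):
        extra = min(remaining, p_max - p_min)  # fill cheapest units first
        cost += marginal * extra
        remaining -= extra
    return cost


def robust_commitment(units, load, wind_set):
    """First stage: commitment minimizing commitment cost plus the
    worst-case dispatch cost over all wind outputs in wind_set."""
    best = None
    for commit in product((0, 1), repeat=len(units)):
        fixed = sum(u[0] for u, is_on in zip(units, commit) if is_on)
        stage2 = [dispatch_cost(commit, units, load, w) for w in wind_set]
        if any(c is None for c in stage2):
            continue  # balance violated for some w in the uncertainty set
        total = fixed + max(stage2)
        if best is None or total < best[0]:
            best = (total, commit)
    return best


units = [(100, 20, 20, 100), (30, 50, 10, 60)]
wide = robust_commitment(units, load=120, wind_set=[0, 20, 40])
narrow = robust_commitment(units, load=120, wind_set=[20, 40])
```

In this example, shrinking the uncertainty set so that zero wind is excluded allows the second unit to stay off, which illustrates how the choice of uncertainty set drives the conservatism of the commitment.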

A different robust approach for energy management in
microgrids is pursued in [91]. Microgrids are power systems
comprising many distributed energy resources (DERs)
and electricity end-users, all deployed across a limited
geographical area. Depending on their origin, DERs can come
either from distributed generation (DG), meaning small-scale
power generators based on fuels or RESs, or from distributed
storage (DS), such as batteries. The case where a microgrid
is connected to the main grid, while energy can be sold to
or purchased from the main grid, is considered in [91]. The
approach adopts a worst-case transaction cost. Leveraging
dual decomposition, its solution is obtained in a distributed
fashion by local controllers of the DG units and dispatchable
loads.

4) Scenario-based Stochastic Programming: This method
also amounts to a two-stage adaptive approach, albeit in a
different manner than the previous one. Here, a discrete set
of possible scenarios for the RES power output across the
horizon is considered. For instance, considering 8 hours with
power output taking 7 possible values in each hour, there are
$7^8$ (about 5.8 million) possible scenarios. A probability is attached
to each of these scenarios (or only to a selection thereof).
Similar to (24a), the objective
includes startup/shutdown costs due to on/off scheduling. But
instead of a worst-case part, the expected cost of generation
dispatch with respect to the scenario probabilities is included
in the objective.
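
A small Python sketch may help contrast the expected-cost objective with the worst-case one. Scenario paths are enumerated as all combinations of per-period wind values, the first-stage decision is the number of identical units to commit, and the second stage dispatches those units in every scenario, buying any shortfall at a penalty price. Independence of the wind across periods, the identical units, the penalty price, and all numbers are assumptions introduced only for this illustration.

```python
from itertools import product


def scenario_paths(values, probs, periods):
    """Yield (wind path, probability) for every combination of per-period
    wind values; probabilities multiply under the independence assumption."""
    for path in product(range(len(values)), repeat=periods):
        prob = 1.0
        for i in path:
            prob *= probs[i]
        yield [values[i] for i in path], prob


def expected_cost(n_units, load, values, probs, periods,
                  startup=50.0, cap=40.0, marginal=10.0, penalty=100.0):
    """Startup cost of the committed units plus the expected dispatch cost
    over all scenario paths (hypothetical cost parameters)."""
    total = startup * n_units
    for wind_path, prob in scenario_paths(values, probs, periods):
        path_cost = 0.0
        for w in wind_path:
            net = max(load - w, 0.0)
            own = min(net, n_units * cap)          # served by committed units
            path_cost += marginal * own + penalty * (net - own)
        total += prob * path_cost
    return total


def best_commitment(max_units, **model):
    """Number of units minimizing the expected total cost."""
    return min(range(max_units + 1), key=lambda n: expected_cost(n, **model))


n_star = best_commitment(4, load=100.0, values=[0.0, 30.0, 60.0],
                         probs=[0.2, 0.5, 0.3], periods=3)
```

The number of paths grows as the number of values raised to the number of periods, which is why only a selection of scenarios is typically retained in practice.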

The aforementioned approach is pursued in H, whereby 
the scheduling of spinning reserves is also included. Spinning 
reserve is generation capacity that is not currently used to 
serve the load, but is connected to the system (spinning) and 
is available to serve the load in case there is loss of generation. 
Spinning reserves are instrumental components of any power 
system, and the premise here is that they can be provisioned 
in a manner adaptive to the RES uncertainty. 

5) Multi-Stage Stochastic Dynamic Programming: The aim
here is to address the decision making challenges for an LSE 
obtaining energy from the market as well as from RESs (cf. 
Fig IS- The LSE may procure energy in the day-ahead market, 
as well as in the real-time market, which is a decision made 
on-the-fly during the scheduling horizon. The energy from 
RESs is typically cost-free, but random. In addition, the LSE 
must provide power to the end-users during the horizon, and 
take the associated pricing decisions. The multiple-timescale 
feature reflects exactly the resolution over day-ahead, hour- 
ahead, and real-time (e.g., at the scale of minutes) decisions. 
Multi-stage dynamic programming captures the coupling of
decisions across time due to end-users' power requirements,
e.g., total energy requested over a specified interval, or
price-adaptive random opportunistic demand.



6) Network Optimization Based on Long-Term Average
Criteria: This approach relies on queueing-theoretic and
Lyapunov-based stochastic network optimization methods
popular in resource allocation tasks for wireless networks. A
load-serving entity obtaining energy from the market as well
as from RESs is considered in [59], [31]. The objective is
cost minimization or social welfare maximization in a
long-term average fashion over an infinite horizon, and the decision
variables include pricing and power provided to end-users; see
also [43] for energy storage management policies.
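
The flavor of Lyapunov-based methods can be conveyed by a drift-plus-penalty sketch in Python. A queue holds deferrable energy demand; in each slot, free renewable energy is used first, and market energy is bought only when the backlog exceeds a price-dependent threshold. The uniform price, RES, and arrival models, the parameter `V`, and the function name `simulate` are hypothetical and chosen only for illustration; pricing decisions and social welfare, which the cited formulations include, are left out.

```python
import random


def simulate(V, slots=5000, x_max=5.0, a_max=4.0, p_max=10.0, seed=1):
    """Drift-plus-penalty purchasing policy for a demand backlog queue.

    V trades off average cost against backlog. Returns the average purchase
    cost per slot and the largest backlog observed. Assumes x_max >= a_max.
    """
    rng = random.Random(seed)
    queue, total_cost, peak = 0.0, 0.0, 0.0
    for _ in range(slots):
        price = rng.uniform(1.0, p_max)     # market price in this slot
        res = rng.uniform(0.0, 2.0)         # cost-free renewable energy
        arrivals = rng.uniform(0.0, a_max)  # new deferrable demand
        # Minimizing V*price*x - queue*x over 0 <= x <= x_max gives a
        # threshold rule: buy at full rate only if the backlog is large
        # relative to the current price.
        buy = x_max if queue > V * price else 0.0
        total_cost += price * buy
        queue = max(queue - res - buy, 0.0) + arrivals
        peak = max(peak, queue)
    return total_cost / slots, peak


avg_cost, peak_backlog = simulate(V=20.0)
```

Under these assumptions the backlog never exceeds `V * p_max + a_max`, so a larger `V` permits waiting for lower prices at the expense of longer delays for the end-users.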

V. Open Issues 

Although the SP research efforts on the power grid are fast
growing, many open issues await investigation.
Regarding situational awareness, integrating local power
grids into interconnections poses modeling and computational
challenges. Monitoring grids of such dimensionality and detail calls
for scalable and modular algorithms. To communicate and
process the massive volume of measurements in real time 
with tractable complexity, the issues related to compressing, 
layering, relaying, and storing these data must be considered 
too. The "big data" challenges further extend to addressing 
the missing data and the under-determinacy of the resultant 
systems of equations, as well as model reduction tasks, for 
which contemporary statistical learning approaches could pro- 
vide viable solutions. 

The control and optimization dimensions entail conventional 
generation as well as RESs, interconnected via transmission 
and distribution networks, serving large industrial customers 
and residential end-users with smart appliances and P(H)EVs, 
as well as microgrids with distributed generation and storage. 
SP researchers can cross-fertilize their ample expertise on 
resource allocation gained in the context of communication 
networks to optimize power network operations. Major
challenges include the successful coordination of system-level
economic operations such as OPF and UC, while embracing
small-scale end-users through DR and coordinated P(H)EV
charging. Integrating random and intermittent RESs across all
levels poses further challenges. Issues related to leveraging the
markedly improved monitoring modalities in grid operations
are worth careful study. Although research efforts tackling
individual problems have yielded promising outcomes, achieving
the grand goal of reliable and efficient grid operations still calls
for novel formulations, insightful approximations, integration,
and major algorithmic breakthroughs.